Are social equivalences ever regular?

Are social equivalences ever regular?

Social Networks 23 (2001) 87–123 Are social equivalences ever regular? Permutation and exact tests John P. Boyd a,∗ , Kai J. Jonas b a Department of...

329KB Sizes 0 Downloads 28 Views

Social Networks 23 (2001) 87–123

Are social equivalences ever regular? Permutation and exact tests John P. Boyd a,∗ , Kai J. Jonas b a

Department of Anthropology, University of California at Irvine, 3151 Social Science Plaza, Irvine, CA 92697, USA b Georg-August-Universitaet, Goettingen 37073, Germany

Abstract A regular equivalence on a relation induces matrix blocks that are either 0-blocks or regular-blocks, where a regular-block contains at least one positive entry in each row and column. The authors devise both a permutation test and an exact statistical test that separates these two aspects of regular equivalence, 0-blocks and regular-blocks. To test for the regular-block property, the natural test statistic is the number of rows and columns within each purported regular-block that fail to meet the criteria of having at least one positive entry. This statistic is computed for permutations that fix each regular-block as a whole (alternatively, within each sub-row), except for diagonal blocks, for which the diagonal entries are individually fixed. The exact test is derived by assuming that the number of zeros in each block is fixed and that each permutation of zeros is uniformly distributed. This implies that the probability of finding, say, k zeros in a given set of rows and columns follows the hypergeometric distribution, known in physics as the Fermi–Dirac statistics. These results from the separate blocks are combined by convolution to give the distribution of k zero vectors in the matrix as a whole. These tests were applied to data sets from Sampson’s Monastery, Wasserman and Faust’s Countries Trade Networks, Krackhardt’s High-Tech Managers, and B.J. Cole’s Dominance Hierarchies in Leptothorax ants. In all four cases, the 0-blocks were very significant, having only a tiny fraction of permutations with fewer errors than was found in the data. With the regular-blocks, however, there was no significant relation in the Countries data and a significant overall tendency in the other three data sets toward having more departures from regular 1-blocks in the data than in the permuted matrices. © 2001 Elsevier Science B.V. All rights reserved. Keywords: Fermi–Dirac statistics; Regular equivalence; Permutation tests; Exact tests

1. Introduction A social network can be described by one or more square matrices whose entries measure various social ties. The labels of these matrices represent individuals. These individuals can ∗ Corresponding author. E-mail address: [email protected] (J.P. Boyd).

0378-8733/01/$ – see front matter © 2001 Elsevier Science B.V. All rights reserved. PII: S 0 3 7 8 - 8 7 3 3 ( 0 1 ) 0 0 0 3 2 - 6

88

J.P. Boyd, K.J. Jonas / Social Networks 23 (2001) 87–123

be partitioned into social roles according to various principles. One of the most common such principle is called “regular” equivalence. The purpose of this paper is to test the empirical validity of this regularity concept, controlling for density within each block induced by the equivalence relation. A regular equivalence on the labels of a matrix can be defined as an equivalence that induces matrix blocks that are either 0-blocks or regular-blocks, where a 0-block contains only zeros, and where a regular-block contains at least one positive entry in each row and column. This definition allows for matrices over the non-negative reals, instead of just 0–1 matrices. A blockmodel specifies which blocks are supposed to be 0 or regular. Many authors have contributed to the theory of regular equivalence (White et al., 1976; Sailer, 1978; Pattison, 1982, 1993; White and Reitz, 1983; Kim and Roush, 1984; Reitz and White, 1989 (from a 1980 conference), Boyd, 1991; Everett and Borgatti, 1993; Boyd and Everett, 1999). The classic conceptual example for regular equivalence is the “gives orders to” relation among hospital staff with the equivalence classes of “nurses and doctors”. If nurses do not usually give orders to doctors, then this would be a 0-block. Similarly, if each doctor gives orders to some nurse, and every nurse receives orders from some doctor, then the other off-diagonal block would be a regular-block, with the associated rights and duties of such a role. Finally, we might suppose that both diagonal blocks are zero, since neither doctors nor nurses give orders to their own kind. Of course, a finer-grained analysis would find rank differences among sub-classes of doctors on the one hand, and nurses on the other. While the usage in sociology is not entirely consistent, we follow Wasserman and Faust’s (1994) interpretation, where, for example, the “position” of a “nurse” involves several “roles” vis-a-vis other positions, such as “doctor or patient”. More formally, let I be a set of individuals and P be a set of social positions. Every position function π : I → P that assigns positions to individuals, determines an equivalence relation on individuals in the natural way: i and j are π-equivalent if and only if they have the same position, i.e. if and only if π(i) = π(j ). Ordered pairs of positions are called roles. A relation on I can be represented as an I × I matrix A of “truth values” from the set N of non-negative integers that represent the strength of ties. That is, a relation is a function A : I × I → N. If (b, c) is a role, then a block B of the matrix A is just the restriction of A, considered as a function, to that subset of ordered pairs (i, j) such that (π(i), π(j )) = (b, c). That is, the domain of a block is the inverse image of a role. Next, the image matrix M is a P × P binary matrix. Typically, Mb,c = 0 if the number of zeros in the corresponding block of A is less than expected according to some empirical or a priori model, otherwise, Mb,c = 1. Finally, a blockmodel is a system (I, A, P, π, M), of individuals, a relation, a set of position, a position function, and an image matrix, as described above.

2. Data We examined four classic social networks, chosen for their intrinsic interest and diversity. Another requirement was that each data set be accompanied by a blockmodel: it would not be statistically valid for us to search for the most regular blockmodel and then test for its significance, unless this search was itself incorporated into the test. This incorporation of a search algorithm with a permutation test has been done by Boyd (2000), confirming the

J.P. Boyd, K.J. Jonas / Social Networks 23 (2001) 87–123

89

Table 1 Sampson Monastery: positive relations from T4 (1–7, 8–15, 16–18) and so on for seven more matrices (18×18×4) WBB code

Monk name

White, Boorman, and Breiger code 10

5

9

6

4

11

8

12

1

2

14

15

7

16

13

3

17

18

– 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

1 – 1 1 3 2 0 0 0 0 0 0 0 0 2 0 0 0

1 1 – 2 0 0 1 0 0 0 0 0 0 0 0 0 0 0

0 0 0 – 1 0 2 0 0 0 0 0 0 0 0 0 0 0

3 3 0 3 – 0 3 0 0 0 0 0 0 0 0 0 0 0

0 2 0 0 2 – 0 0 0 0 0 0 0 0 0 0 0 0

0 0 3 0 0 3 – 0 0 0 0 0 0 0 0 0 0 0

0 0 2 0 0 0 0 – 1 2 1 2 1 0 0 0 0 0

0 0 0 0 0 0 0 3 – 3 3 0 0 0 0 3 0 0

0 0 0 0 0 0 0 2 0 – 0 3 3 3 0 0 1 1

0 0 0 0 0 1 0 0 2 0 – 0 0 0 0 0 0 0

0 0 0 0 0 0 0 0 0 0 2 – 0 1 0 0 0 0

0 0 0 0 0 0 0 1 0 1 0 2 – 2 1 0 0 0

0 0 0 0 0 0 0 0 0 0 0 0 2 – 0 0 0 0

0 0 0 0 0 0 0 0 0 0 0 0 0 0 – 2 0 0

0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 – 2 2

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 – 3

0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 2 3 –

D14 , p. 470 Esteem (esteem) 10 Romuald –

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

D13 , p. 469 Affective (like) 10 Romuald 5 Bonaventure 9 Ambrose 6 Berthold 4 Peter 11 Louis 8 Victor 12 Winfrid 1 John Bosco 2 Gregory 14 Hugh 15 Boniface 7 Mark 16 Albert 13 Amand 3 Basil 17 Elias 18 Simplicius

conclusions reached here. The four networks were the Sampson Monastery, the Countries Trade Network, the Krackhardt High-Tech Managers data, and B.J. Cole’s Dominance Hierarchies in Leptothorax ants. While these data are too extensive to reproduce here, sample matrices from each of the data sets are presented in Tables 1–6. The Sampson Monastery data (Sampson, 1969, pp. 469–472) has probably been analyzed more than any other social network. Why should we try it yet again? For one thing, a new technique should be used first on old data, so that it may be compared with other methods Secondly, even this classic case requires a modification of the standard definition of regular equivalence, since the data is not binary in its raw form. Thirdly, there has been a great deal of confusion about the Sampson data, which we hope to clear up by going directly to his thesis. Finally, White et al. (1976) present a blockmodel, that we can test. The classic Sampson data is from “time period 4,” T4 , and includes four distinct relations, found in Sampson’s Tables D13 –D16 , labeled the “Affective”, “Esteem”, “Influence”, and “Sanction” matrices, respectively. Each matrix is 18 × 18 with integer entries from −3 to +3, except for blanks on the diagonals. The instructions for the Affective relation were List those three brothers whom you personally like the most: Like the most Like 2nd most Like 3rd most

90

J.P. Boyd, K.J. Jonas / Social Networks 23 (2001) 87–123

Table 2 Trade of basic manufactured goods between Countries and so on for four more matrices (24 × 24 × 1)a – 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 1 0 0 0

1 – 1 1 1 1 1 1 0 1 1 1 1 1 1 0 1 0 0 0 1 1 0 0 a

1 1 – 1 1 1 1 1 1 1 1 1 1 1 1 0 1 0 1 1 1 0 0 1

1 1 1 – 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0

1 1 1 1 – 1 1 1 0 1 1 0 1 1 0 1 1 0 0 0 0 0 0 0

1 1 1 1 1 – 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0

1 1 1 1 1 1 – 1 1 1 1 1 1 1 1 0 1 0 0 0 1 1 0 0

1 1 1 1 1 1 1 – 0 1 1 1 0 1 1 1 1 0 0 0 1 0 0 0

1 0 1 1 1 0 1 1 – 1 1 0 0 1 0 0 0 0 0 0 1 0 0 0

1 1 1 1 1 0 1 0 1 – 1 0 0 1 1 0 0 0 1 0 0 0 0 0

1 1 1 1 1 1 1 1 1 1 – 0 1 1 1 0 1 0 0 0 1 0 0 0

1 1 1 1 1 1 1 1 0 1 1 – 1 1 1 0 0 0 0 0 1 0 0 0

1 1 1 1 1 1 1 1 1 1 1 1 – 1 1 0 0 0 0 0 0 0 0 0

1 1 1 1 1 1 1 1 1 1 1 0 1 – 1 0 1 0 0 0 1 0 0 0

1 1 1 1 1 1 1 1 1 1 1 1 1 1 – 0 0 0 0 0 1 0 0 0

1 1 1 1 1 1 1 1 1 1 1 1 0 1 0 – 0 0 0 0 0 0 0 0

1 1 1 1 1 1 1 1 0 1 1 1 0 1 0 0 – 0 0 0 0 0 0 0

1 1 1 1 1 0 1 1 0 1 1 0 1 1 1 0 0 – 0 0 0 0 0 0

1 1 1 0 1 0 1 0 1 1 1 0 0 1 0 0 0 0 – 0 0 0 0 0

1 1 1 1 1 0 1 0 0 1 1 0 0 1 0 0 0 0 0 – 0 0 0 0

1 1 1 0 0 0 1 1 1 1 1 0 0 1 1 0 0 0 0 0 – 0 0 0

1 1 1 1 1 0 0 1 0 0 1 0 0 1 0 0 1 0 0 0 1 – 0

1 1 1 1 1 0 1 0 0 1 0 0 1 1 0 0 0 0 0 0 0 0 – 0

1 1 1 1 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 –

Wasserman and Faust (1994), p. 749, Table B.12.

followed by a similar form for “like the least”. Note that “like the most” was coded as a3, not a1, while “like 3rd most” was a1. Similarly, “like the least” was a − 3. The other three relations were elicited in the same manner, except for the substitution of “esteem”, “influence”, and “support, praise and/or help” for “like”. Although the positive and negative arcs are listed in the same table by Sampson, it is usual to split the four matrices into eight on the basis of “positive” versus “negative” relations, changing the negative signs to positive. This seems perfectly reasonable, and is in fact what has been done in our Table 1, which gives only the first of eight Sampson T4 matrices. Other data reductions, however, such as collapsing the values 1, 2, and 3 onto 1 may result in an unnecessary loss of information. Another source of confusion is that White et al. (1976) rearranged the Sampson (1969) matrices to conform with their blockmodel. While this again is reasonable, subsequent authors have sometimes not been clear as to which ordering was being presented. The numbers in White et al. (1976) are nothing more than the rank order, from 1 to 18, in which they appeared in Sampson (1969), who had yet another numbering system. If you are not looking at an actual copy of Sampson’s (1969) thesis, we suggest you apply the Romuald–Basil Affective test. • Nobody likes Romuald. • Both he and Basil make four Affective choices, the others making only the requested three.

J.P. Boyd, K.J. Jonas / Social Networks 23 (2001) 87–123

91

Table 3 Krackhardt’s High-Tech Managers: friendship (Wasserman and Faust ordering) 21 × 21 × 21 and so on for 20 more matrices, plus 21 more for the “advice” relation 3

5

ACTOR 1 – 1 0 – 1 1 0 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 ACTOR 2 – 0

9

13

15

19

20

1

4

7

8

10

16

18

21

2

6

11

12

14

17

0 1 – 1 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0

0 1 1 – 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

0 1 0 0 – 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

1 1 1 1 1 – 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0

0 0 0 0 0 0 – 0 0 0 0 0 0 0 0 0 0 0 0 0 0

0 0 0 0 0 0 0 – 1 0 1 0 1 0 0 1 0 0 1 0 0

0 0 0 0 0 1 0 1 – 0 1 0 0 0 0 0 0 1 1 0 0

0 0 0 0 0 0 0 0 0 – 0 0 0 0 0 0 0 0 0 0 0

0 0 0 0 0 0 0 1 1 0 – 0 0 0 0 0 0 0 1 0 0

0 0 0 0 0 0 0 0 0 0 0 – 0 0 0 0 0 0 0 0 0

0 0 0 0 0 0 0 1 0 0 0 0 – 0 0 1 0 0 0 0 0

0 0 0 0 0 0 0 0 0 0 0 0 0 – 1 1 0 0 0 0 0

0 0 0 0 0 0 0 0 0 0 0 0 0 1 – 1 0 0 0 0 1

0 0 0 0 0 0 0 1 0 0 0 0 1 1 1 – 0 0 0 0 1

0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 – 1 1 0 0

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 – 0 0 0

0 0 0 0 0 0 0 1 1 0 1 0 0 0 0 0 1 0 – 0 1

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 – 0

0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 1 0 –

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

1

0

Table 4 Dominance hierarchies in Leptothorax ants, 16 × 16, bipartition plus another 13 × 13 matrix – 0 0 0 0 0 0

43 – 0 0 0 0 0

19 45 – 0 0 0 0

18 20 3 – 0 0 0

1 3 1 2 – 0 0

2 1 5 3 1 – 0

2 5 3 2 0 2 –

1 1 1 2 0 1 0

0 0 2 1 0 0 0

0 0 0 0 0 0 0

0 0 0 1 0 0 0

0 1 0 0 0 0 0

0 1 0 0 0 1 0

1 0 0 0 0 0 0

1 0 0 0 0 0 0

2 1 0 0 0 0 1

0 0 0 0 0 0 0 0 0

0 0 0 0 0 0 0 0 0

0 0 0 0 0 0 0 0 0

0 0 0 0 0 0 0 0 0

0 0 0 0 0 0 0 0 0

0 0 0 0 0 0 0 0 0

0 0 0 0 0 0 0 0 0

– 0 0 0 0 0 0 0 0

0 – 0 0 0 0 0 0 0

0 0 – 0 0 0 0 0 0

0 0 0 – 0 0 0 0 0

0 0 0 0 – 0 0 0 0

0 0 0 0 0 – 0 0 0

0 0 0 0 0 0 – 0 0

0 0 0 0 0 0 0 – 0

0 0 0 0 0 0 0 0 –

92

J.P. Boyd, K.J. Jonas / Social Networks 23 (2001) 87–123

Table 5 Sampson Monastery, positive relations from T4 Name

S0

N

P[S = S0 ]

P[S > S0 ]

E[S]

S.D.[S]

Z

G1

LikeR EstmR InflR PraiR

1 0 1 3

10000 10000 10000 10000

0.2943 0.5481 0.4459 0.3442

0.0508 0.4519 0.2360 0.1413

0.40 0.53 0.96 2.51

0.60 0.65 0.82 0.96

1.01 −0.82 0.05 0.51

1.31 0.97 0.55 0.33

TotalR

5

40000

0.4081

0.2200

4.39

1.54

0.39

0.31

LikeW EstmW InflW PraiW

1 2 3 9

10000 10000 10000 10000

0.3651 0.2310 0.1881 0.0098

0.1127 0.0755 0.0709 0.0009

0.61 1.12 1.83 5.37

0.73 0.94 1.10 1.32

0.54 0.95 1.06 2.74

1.05 0.65 0.37 0.18

TotalW

15

40000

0.1985

0.0650

8.92

2.09

2.91

0.20

LikeZ EstmZ InflZ PraiZ

16 20 22 12

10000 10000 10000 10000

0.0000 0.0000 0.0000 0.0000

1.0000 1.0000 1.0000 1.0000

76.21 73.45 72.68 52.82

6.81 6.79 6.71 5.85

−8.85 −7.87 −7.55 −6.98

−0.05 −0.10 −0.13 −0.14

TotalZ

71

40000

0.0000

1.0000

275.17

13.10

−15.66

−0.05

This rule of thumb should tell you whether your version of the Sampson data contains one of the common faulty labeling schemes. The Countries Trade Network can be found in Wasserman and Faust (1994). It consists of five 24 × 24 binary matrices of relations between 24 selected countries from Algeria Table 6 Analysis of Sampson Monastery, negative relations from T4 Name

S0

N

P[S = S0 ]

P[S > S0 ]

E[S]

S.D.[S]

Z

G1

LikeR EstmR InflR PraiR

15 14 11 19

10000 10000 10000 10000

0.0422 0.0158 0.1943 0.0058

0.0234 0.0051 0.4683 0.0018

11.53 9.50 11.36 13.80

2.01 1.95 1.99 1.92

1.73 2.30 0.18 2.71

0.01 0.06 0.05 0.04

TotalR

59

40000

0.0645

0.1246

46.19

3.94

3.25

0.02

LikeW EstmW InflW PraiW

27 24 19 34

10000 10000 10000 10000

0.0000 0.0001 0.0928 0.0005

0.0000 0.0000 0.0770 0.0004

17.38 16.42 16.44 27.90

2.26 2.00 2.18 1.91

4.25 3.78 1.18 3.20

0.00 0.07 0.01 0.04

TotalW

104

40000

0.0233

0.0193

78.14

4.18

6.18

0.01

LikeZ EstmZ InflZ PraiZ

5 7 1 7

10000 10000 10000 10000

0.0000 0.0000 0.0000 0.0002

1.0000 1.0000 1.0000 0.9997

29.36 35.15 31.42 25.39

6.26 6.68 6.57 5.95

−3.90 −4.21 −4.63 −3.09

0.10 0.08 0.08 0.12

TotalZ

20

40000

5.e–05

0.9999

121.31

12.74

−7.95

0.05

J.P. Boyd, K.J. Jonas / Social Networks 23 (2001) 87–123

93

to the former Yugoslavia. These five relations are found in (Wasserman and Faust (1994, pp. 749–753), Tables B.12, trade of basic manufactured goods; B.13, trade of food and live animals; B.14, trade of crude materials, excluding food; B.15, trade of minerals, fuels, and other petroleum products; and B.16, exchange of diplomats. Wasserman and Faust (1994, pp. 404–406) propose three distinct blockmodels for this data, one for the middle three matrices (“raw materials”), and one for each of the other two. The Krackhardt High-Tech Managers data is also described by Wasserman and Faust (1994). There are two kinds of relations between the 21 managers in the company, “Advice” and “Friendship”. The data is just binary 0’s and 1’s, but the catch here is that instead of just two 21 × 21 matrices, there are two 21 × 21 × 21 matrices. That is, each of the 21 managers was asked not only who they personally sought “advice” from or “liked,” but they were also asked about the advice and liking patterns of the other 20 managers (Krackhardt, 1987). Finally, the Dominance Hierarchies in Leptothorar ants (Cole, 1981) data was observations of ritual dominance activities among 16 female Leptothorax allardycei ants over 18.2 h in a “queenright” colony. It would be fascinating if social network principles could be applied to social groups outside the usual confines of Chordata. Maybe these principles are to be found in any complex system, such as brains and power grids. Our ordering is the original except that ninth is placed seventh. These ants really need a more detailed structural model: a linear ordering of the first seven followed by a class of nine ants that are all dominated by the other seven. Still, this data does appear to be a very good example of a regular equivalence with two blocks, the linear order part and the ants they dominate. However, appearances can be deceiving.

3. Regular- and 0-blocks Some definitions of regular equivalence apply only to 0–1 matrices, which would exclude some important social networks, such as the Sampson data, which has integers from 0–3. This section gives a definition of regular equivalence that is general enough to cover these examples, although a more general treatment would consider matrices of semi-rings (Boyd, 1991; Batagelj, 1994). Given an I × I matrix A of non-negative integers and a position function π : I → P a block B is a 0-block if it is a zero matrix, i.e. if Bij = 0 for all i, j in its role. On the other hand, a block is regular if Bi+ ≡

 Bik > 0 k

and

Bj + ≡

 Bkj > 0

(1)

k

for all i, j in the domain of the block. Said in another way, the marginals of regular-blocks must be positive. This is equivalent to the usual definition that forbids 0-rows or 0-columns. Finally, the position function itself is said to be regular with respect to the matrix A if and only if each block is either zero or regular. If ≡ is the equivalence relation determined by a position function π, then ≡ is said to be a regular equivalence if and only if π is regular.

94

J.P. Boyd, K.J. Jonas / Social Networks 23 (2001) 87–123

4. Permutation tests A regular equivalence is defined by the two kinds of blocks it induces, 0-blocks and regular-blocks. The authors devised a permutation test that separates out these two aspects of regular equivalence for square matrices. Recall that a permutation test requires a test statistic and a specified permutation group (Hubert, 1987; Good, 1994). If the test statistic can be put in a quadratic form as in Mantel’s (1967) U:  U= Sij Tij (2) ij

where S and T are measures of, say, spatial and temporal similarity of where and when two people came down with cholera, then a “large” value of U might indicate that cholera spreads by contagion. The question is, is the “large” value of U due to chance? One answer is to look at a large sample of random permutations ρ of these individuals and compute new, randomized U’s:  Uρ = (3) Sij Tρ(i),ρ(j ) ij

which are then compared with the original U. If U is larger than, say, 95% of the Uρ , then we can accept the hypothesis of contagion at the 95% level. This is the so-called quadratic assignment procedure (QAP), and its beauty is that no parametric assumptions are made. Our permutation tests for 0-blocks are quadratic (in fact, even linear), but those for regular-blocks are not, so they do not fit into the more restricted QAP framework.

5. Parametric approximations to permutation statistics While it may seem perverse to approximate a non-parametric permutation distribution with a parametric distribution, this procedure may be necessary for large data sets. If a matrix is large enough, say, 1000 × 1000, then it may be possible to compute statistics for only 100 or so permutations. One might get all the statistics greater than the reference value, and yet the size of the z-score may suggest a true P-value closer to 0.01 than to 1%. In this case, it makes sense to approximate the situation with a continuous distribution function. While it is always legitimate to compute the z-score as a measure of departure from the mean value, it may be a mistake to take literally the P-value derived from the normal distribution. The normal curve fails to be a good approximation if, for example, the empirical distribution has any significant skewness. In fact, strongly positive skewness is found in several of the permutation tests in this paper, the most extreme of which is 3.25 (Table 7, manager M1, under the heading G1 ). One approach is to reduce the skewness by taking square roots or other transformations. However, Hubert (1987) suggests that the density distribution f of (2 + γ z)/γ be approximated by the γ distribution with parameters α = 4/γ 2 and β = γ /2: 2  2  2 + γ z (4/γ )−1 −2(2+γ z)/γ 2 (2/γ )4/γ f (z) = e (4) γ Γ (4/γ 2 )

J.P. Boyd, K.J. Jonas / Social Networks 23 (2001) 87–123

95

Table 7 Analysis of Krackhardt High-Tech Managers: advice, whole block equivalence Name

S0

N

P[S = S0 ]

P[S > S0 ]

E[S]

S.D.[S]

Z

G1

M1 M2 M3 M4 M5 M6 M7 M8 M9 M10 M11 M12 M13 M14 M15 M16 M17 M18 M19 M20 M21

1 8 4 2 4 22 5 6 3 7 17 4 18 4 13 22 28 6 10 10 0

10000 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000

0.0819 0.0001 0.0059 0.0767 0.0017 0.0013 0.0045 0.0005 0.2781 0.0015 0.0000 0.0270 0.0942 0.1757 0.0597 0.0941 0.0258 0.0029 0.0005 0.1561 0.8403

0.0027 0.0000 0.0002 0.0083 0.0000 0.0001 0.0006 0.0000 0.3567 0.0001 0.0000 0.0033 0.0721 0.1100 0.0374 0.0756 0.0092 0.0001 0.0004 0.1595 0.1597

0.09 2.18 0.93 0.53 0.83 15.34 1.66 1.26 3.04 2.27 8.53 1.34 15.47 2.74 9.98 19.55 24.04 1.79 3.99 8.59 0.17

0.29 1.20 0.88 0.67 0.78 2.16 1.03 1.03 1.40 1.26 1.93 1.03 2.08 1.43 1.98 2.07 1.92 1.16 1.64 1.95 0.40

3.13 4.86 3.50 2.17 4.08 3.09 3.23 4.59 −0.03 3.75 4.39 2.59 1.21 0.88 1.53 1.18 2.06 3.62 3.66 0.72 −0.42

3.25 0.39 0.76 1.08 0.68 −0.01 0.39 0.69 0.26 0.41 0.07 0.61 0.04 0.38 0.05 0.01 0.02 0.52 0.27 0.08 2.27

Total

194

210000

0.0918

0.0474

124.31

6.69

10.42

0.01

where γ is the coefficient of skewness and −2/γ < z < ∞. In other words, Hubert takes a standard γ distribution with the given parameters and then shifts it to the left by 2/γ , resulting in a mean of zero and retaining its variance of one. Fig. 1 compares the cdf’s of a standardized normal with that of Hubert’s distribution, where the skewness γ is set equal to the large value of 3.25 from Tables 7–12 . Notice that Hubert’s distribution is 0 at z = −2/γ = −0.6154, while the normal is still at 0.2691 and continuing smoothly down toward zero to the left, or lower, tail, Moving toward the right, however, Hubert’s distribution rises rapidly until at z = 0, it is 0.21 points higher than the normal (which is 0.5). Finally, the curves cross again and reach another local maximum separation of 0.0286 at z = 2.2528, where the normal equals 0.9756 and Hubert’s distribution is 0.9526. This example shows that using the normal distribution as an approximation to Hubert’s distribution can result in significant errors, especially on the lower tail. However, the normal approximation is much better for 0 < γ < 1. In addition, Hubert’s distribution is not defined for γ < 0. In conclusion, Hubert’s distribution should be reserved for empirical data with large positive skewness. When several data sets are combined, as in this paper, skewness is reduced, reducing the need for Hubert’s distribution. We shall report the raw z-scores, and prudently interpret P-values drawn from either Hubert’s or the normal distribution.

96

J.P. Boyd, K.J. Jonas / Social Networks 23 (2001) 87–123

Fig. 1. A comparison of normal (0, 1) and Hubert (3.25) cdf’s.

Table 8 Analysis of Krackhardt High-Tech Managers: advice, 0-blockmodel Name

S0

N

P[S = S0 ]

P[S > S0 ]

E[S]

S.D.[S]

Z

G1

M1 M2 M3 M4 M5 M6 M7 M8 M9 M10 M11 M12 M13 M14 M15 M16 M17 M18 M19 M20 M21

147 32 99 90 85 18 58 52 78 71 27 48 14 69 55 20 17 63 42 37 83

10000 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000

0.0608 0.0000 0.0772 0.0025 0.0019 0.0037 0.0000 0.0001 0.0738 0.0298 0.0002 0.0000 0.0003 0.0739 0.0740 0.0326 0.0321 0.0047 0.0014 0.0175 0.0000

0.2265 1.0000 0.4784 0.9942 0.9960 0.9933 1.0000 0.9999 0.6167 0.9044 0.9997 0.9999 0.9996 0.6643 0.2424 0.9295 0.9362 0.9901 0.9964 0.9564 1.0000

143.78 57.07 99.20 103.31 98.60 26.97 81.42 69.59 79.93 77.85 40.00 67.98 25.43 71.56 52.41 25.43 22.26 74.78 54.51 44.65 102.78

4.91 4.47 5.12 5.13 5.05 3.43 4.95 4.78 4.98 4.86 3.98 4.78 3.32 4.86 4.39 3.31 3.09 4.85 4.44 4.16 5.16

0.66 −5.61 −0.04 −2.59 −2.69 −2.62 −4.73 −3.68 −0.39 −1.41 −3.27 −4.18 −3.45 −0.53 0.59 −1.64 −1.70 −2.43 −2.81 −1.84 −3.83

0.00 −0.01 −0.03 −0.01 0.03 0.00 0.01 0.00 −0.05 0.04 −0.01 0.02 −0.03 0.00 −0.01 −0.05 0.00 0.00 0.02 0.00 −0.01

Total

1205

210000

0.0232

0.8535

1419.52

20.74

−10.35

0.00

J.P. Boyd, K.J. Jonas / Social Networks 23 (2001) 87–123

97

Table 9 Analysis of Krackhardt High-Tech Managers: friendship, whole-block equivalence Name

S0

N

P[S = S0 ]

P[S > S0 ]

E[S]

S.D.[S]

M1 M2 M3 M4 M5 M6 M7 M8 M9 M10 M11 M12 M13 M14 M15 M16 M17 M18 M19 M20 M21

27 50 56 45 20 44 22 62 58 39 18 42 34 24 38 50 29 46 15 58 37

1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000

0.0430 0.1860 0.2190 0.0060 0.0000 0.0000 0.0010 0.7730 0.7670 0.0010 0.0020 0.1280 0.1650 0.0090 0.0010 0.0140 0.0020 0.1350 0.0010 0.2440 0.0000

0.0070 0.0460 0.0250 0.0000 0.0000 0.0000 0.0000 0.2270 0.2330 0.0000 0.0010 0.0570 0.1430 0.0010 0.0000 0.0000 0.0020 0.0590 0.0010 0.0330 0.0000

23.68 48.72 54.97 40.66 11.10 38.69 16.04 62.23 58.23 32.87 10.87 40.18 32.57 18.54 31.86 47.01 22.48 44.29 8.48 57.04 30.45

1.78 1.08 0.79 1.65 2.15 1.60 1.71 0.42 0.42 1.84 2.24 1.47 1.88 2.24 1.59 1.13 2.23 1.43 2.04 0.80 1.85

Total

814

21000

0.1284

0.0398

730.95

7.50

Z 1.86 1.19 1.31 2.63 4.13 3.31 3.49 −0.54 −0.55 3.33 3.18 1.23 0.76 2.44 3.85 2.65 2.92 1.20 3.20 1.19 3.53 11.07

G1 −0.15 0.08 0.36 −0.07 0.10 0.19 0.21 1.31 1.27 0.09 0.10 0.20 0.08 −0.03 −0.01 0.13 0.14 0.13 0.20 0.31 0.07 0.02

6. Permutation tests for 0- and regular-blocks To use a permutation test to determine whether a given blockmodel correctly identifies the 0-blocks, a natural test statistic is the sum of the entries in all the presumed 0-blocks, while the permutations are those that individually fix the diagonal entries (since the diagonal is “structurally zero”). Note that these permutations do not preserve the row or column marginals. It is wrong in this case to restrict the permutations that preserve the marginals. To see this, consider a blockmodel where one has an entire row or column of 0-blocks, e.g. in the “doctor–nurse” blockmodel example discussed in the introduction, the “nurse”-row of the image matrix of the “gives orders to” relation consists of two 0’s. Similarly, the “doctor”-column is a pair of 0’s. If an empirical relation A had a small number of 1’s in these three 0-blocks, then a permutation that preserved the marginals would automatically produce a matrix with the same low number of ones in each of the 0-blocks. This means that the test statistic (the number of ones in 0-blocks) would be a constant over the group of marginal-preserving permutations, resulting in a trivial distribution. Clearly, the group of permutations has to produce some spread of the test statistic. Similarly, one might have more, or larger, 0-blocks in some rows or columns than in others. In this case, permutations that preserve the marginals would tend to maintain a lower than average number of zeros in these 0-blocks. In other words, since the marginals are part of the hypothesis (being a 0-block or not), or highly correlated with it, it is wrong to

98

J.P. Boyd, K.J. Jonas / Social Networks 23 (2001) 87–123

Table 10 Analysis of Krackhardt High-Tech Managers: friendship relation, 0-blockmodel Name

S0

N

P[S = S0 ]

P[S > S0 ]

E[S]

S.D.[S]

Z

G1

M1 M2 M3 M4 M5 M6 M7 M8 M9 M10 M11 M12 M13 M14 M15 M16 M17 M18 M19 M20 M21

20 11 1 18 20 10 25 3 2 22 22 11 11 11 5 14 8 8 19 5 24

1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000

0.0000 0.1730 0.0360 0.1490 0.0000 0.0310 0.0000 0.3180 0.2260 0.1130 0.0000 0.0790 0.0110 0.0000 0.0000 0.1610 0.0000 0.0710 0.0000 0.2790 0.0880

1.0000 0.3770 0.9610 0.4790 1.0000 0.9510 1.0000 0.2040 0.6790 0.5630 1.0000 0.8300 0.9850 1.0000 1.0000 0.2560 1.0000 0.8530 1.0000 0.3770 0.6670

30.82 10.81 3.70 18.40 35.33 14.92 40.09 2.56 3.09 23.01 36.43 13.97 16.90 24.73 17.53 12.88 21.04 10.90 38.08 5.11 25.81

3.58 2.6 1.28 2.85 3.72 2.64 4.02 1.11 1.24 3.22 3.85 2.50 2.73 3.33 2.81 2.42 3.09 2.25 3.90 1.53 3.32

−3.02 0.09 −2.10 −0.14 −4.12 −1.86 −3.75 0.40 −0.88 −0.31 −3.75 −1.19 −2.16 −4.12 −4.46 0.46 −4.22 −1.29 −4.90 −0.07 −0.55

0.04 0.06 0.01 0.07 −0.07 −0.04 0.08 −0.01 −0.01 −0.03 −0.03 −0.09 0.03 0.04 −0.06 −0.03 0.00 −0.10 0.08 0.07 −0.03

Total

270

21000

0.0826

0.7706

406.12

13.18

−10.33

0.00

hold the marginals constant. Note that, since minimizing the ones in 0-blocks is equivalent to maximizing the number of ones in the other blocks, this same argument would apply to the corresponding test statistic. This “marginal” issue also comes up when considering permutations within regular-blocks themselves. Table 11 Analysis of Wasserman and Faust Countries data Name

S0

N

P[S = S0 ]

P[S > S0 ]

E[S]

S.D.[S]

Z

G1

B12W B12Z B13 15W B13 15W B13 15W

2 51 3 2 40

10000 10000 10000 10000 10000

0.0967 0.0000 0.4848 0.1327 0.1487

0.0063 1.0000 0.3990 0.0195 0.1815

0.63 144.93 3.32 0.75 38.57

0.68 5.80 0.72 0.76 2.16

2.01 −16.20 −0.44 1.65 0.67

0.75 0.06 0.04 0.75 0.05

TotalW

45

30000

0.2554

0.2000

42.64

2.40

0.99

0.06

B13 15Z B13 15Z B13 15Z

64 56 24

10000 10000 10000

0.0000 0.0000 0.0000

1.0000 1.0000 1.0000

136.79 136.76 60.22

5.75 5.83 5.03

−12.65 −13.84 −7.20

0.03 −0.03 0.00

TotalZ

144

30000

0.0000

1.0000

333.78

9.61

−19.74

0.00

B16W B16Z

0 80

10000 10000

0.8263 0.0000

0.1737 1.0000

0.18 154.38

0.39 5.47

−0.45 −13.60

1.85 −0.03

J.P. Boyd, K.J. Jonas / Social Networks 23 (2001) 87–123

99

On the other hand, an example of the marginals not having to be fixed would be if one wanted to test whether the block sums (without any hypothesis of which are 0’s or 1’s) are different from chance. One could use the standard Chi-square test, but the permutation version (useful if the numbers are small) would be to permute the row and column labels simultaneously as in Eq. (3) and measure the Chi-square statistics. Note that this group of permutations would automatically fix the diagonal elements as a set, but not as individuals. To test for the regular-block property, the natural test statistic is the number of rows and Table 12 C++ code, using the standard template library

(Table Continued on next page)

100 Table 12 (Continued)

J.P. Boyd, K.J. Jonas / Social Networks 23 (2001) 87–123

J.P. Boyd, K.J. Jonas / Social Networks 23 (2001) 87–123

101

Table 12 (Continued)

(Table Continued on next page)

102

J.P. Boyd, K.J. Jonas / Social Networks 23 (2001) 87–123

Table 12 (Continued)

columns within each purported regular-block that fail to meet the criteria of having at least one positive entry. One could argue that it is wrong, for example, to count one 0-row and one 0-column as two errors, since the block could be made regular by a single change, viz. by putting a1 in the intersection. We think this argument is effectively refuted in Section 8. In addition, the number of 0-rows and 0-columns is easy to compute, so we shall use this statistic. The other part of any permutation test, the permutation group, will be the group of all permutations that fix each regular-block as a whole, except that for diagonal blocks, the diagonal entries are individually fixed. That is, entries are randomly shuffled within each regular-block, respecting diagonals, and then “errors” are counted. Here it would be even more wrong than it is in the 0-block case to try to “control for the marginals”, since the very definition of regularity is a statement about marginals, viz. that they are non-zero. A variant of this procedure was devised for data like that from the Sampson Monastery, where most of the rows were forced to have exactly three non-zero entries by the instructions to the subjects. That is, the lack of non-0-rows is no evidence for a tendency for regularity. In this case, permutations are only allowed within each row, again respecting the diagonals. Furthermore, only column errors were counted, since there should be no row errors.

J.P. Boyd, K.J. Jonas / Social Networks 23 (2001) 87–123

103

7. Exact probability distribution of errors in regular-blocks 7.1. Introduction We want to compute the exact probability of having a given number of 0-rows or 0-columns, given a uniform probability distribution on all arrangements of the observed number n0 of 0’s in a block. This model will confirm the results of the permutation test with the “whole-block” condition, but only for data that is, or can naturally be converted to, a binary matrix. These elementary, but extensive, calculations were carried out with the aid of Mathematica Version 4.0.0. There are two main cases to consider, diagonal versus off-diagonal blocks. Since the presence of structural 0’s on the diagonals is an added complication, we will consider the simpler off-diagonal case first.   n Recall the notation for the binomial coefficients , which counts the number of ways k you can pick a subset of size k from a set of size n. “Binomial” is a built-in Mathematica function that can compute the binomial coefficients to an arbitrary number of digits, limited only by computer time and storage, not by the size of a “long integer”. In addition, we can force Mathematica to use the standard mathematical notation. <mn || n0 > mn), 0,   n0 z   mn z //N]]

104

J.P. Boyd, K.J. Jonas / Social Networks 23 (2001) 87–123

In the expression above, the “Module” statement in Mathematica let us define the local variable z, which is the number of 0’s found in the all-zero r rows and c columns, computed according to the hypergeometric distribution. Another notational convention is that the underscores after the arguments on the left-hand side of the “definition” symbol, :=, mean that the arguments can be mathematical expressions, not just numbers. Finally, the “//N” at the end converts the fraction to a floating point number. To see why the formula is correct, first consider only the rows: it is clear that they use up rn zeros, since each row is of length n. Similarly, the c columns take up cm 0’s, but then we have to subtract off the cr 0’s already counted with the rows, giving the above formula for the number of z 0’s. Assuming a uniform probability distribution, and sampling replacement, these    without  n0 mn 0’s can be chosen different ways. The denominator is the number of ways z z of picking z items from the entire pool of mn numbers (zero and non-zero) in the block. Note that we have not yet accounted for the fact that the r rows and c columns might be chosen in many different ways, or that other rows or columns might also be zero. Later in this section, these details will be taken care of by the classical “method of inclusion and exclusion”. As noted in the previous section, one might argue that it is wrong to count the combination of a 0-row and a 0-column as two separate errors, since both errors could be corrected by a single change at their intersection. Although superficially plausible, this suggestion is not supported in the context of this probability model. It turns out that what is far more important is the size of the sub-row or sub-column under consideration. That is, if one really wanted to count some errors as more serious than others, one should look at their relative probabilities. For example, in a block with a large row-length n to column-length m ratio (say, 12–6, with n0 = 47), the probabilities of a specified row being all 0’s is much less than for a column. These probabilities can he computed by the Mathematica function defined above. p[1, 0, 47, 6, 12] 0.00340106 p[0, 1, 47, 6, 12] 0.0687253 So in this example, the probability of a 0-column is about 20 times that of a 0-row. For the actual 7 × 9 upper-right block in the ant1 data (Table 4) with 47 0’s, the rows and columns are much closer in length, so the effect is less, but is still sizeable, almost two to one. p[1, 0, 47, 7, 9] 0.0575742 p[0, 1, 47, 7, 9] 0.113672 The next three calculations illustrate that for the rectangular ant1 block, the probability of a single specified row and a single specified column (making a “cross”-pattern) is greater than the probability of two specified rows, but less than the probability of two specified columns.

J.P. Boyd, K.J. Jonas / Social Networks 23 (2001) 87–123

105

p[1, 1, 47, 7, 9] 0.00615414 p[2, 0, 47, 7, 9] 0.00176483 p[0, 2, 47, 7, 9] 0.00913797 This again shows that the relative length of rows and columns is more important than whether or not there is a way to make the block regular with a single change. In order to isolate the effect of the common cell in a cross-pattern, consider the case of a square block. The next series of calculations shows that for a hypothetical 8 × 8 square block with 47 0’s, the probability of any specified row–column pair being all 0’s is half again that of any specified pair of rows (or columns). This difference is due to the duplicate zero in the intersection of the row and the column. However, both probabilities are still an order of magnitude less than the probability of just one 0-row (or 0-column). It is also less than the difference between the row and column probabilities of the rectangular block above. p[1, 1, 47, 8, 8] 0.00471177 p[2, 0, 47, 8, 8] 0.00307707 p[1, 0, 47, 8, 8] 0.0710451 By writing out the binomial coefficients in terms of factorials and then canceling, it is easy to see that the general ratio of the cross-pattern to the two-row (or two-column) pattern of 0’s reduces to (n2 − 2n + 1)/(n0 − 2n + 1). If the block is half full of 0’s, i.e. n0 = n2 /2, then the cross/two row ratio is close to 2 for large n. Similarly, the ratio of the probability of a single specified 0-row to a pair of such rows is Perm[n2 , n]/Perm[n0 , n]. If again n0 = n2 /2, then this ratio is now greater than 2n . These calculations validate our practice of counting the crossing pattern as two errors instead of just one. 7.3. Errors in arbitrary rows and columns: off-diagonal blocks The next formula computes the sums Sk of all combinations adding up to k of the previously defined probabilities of 0-rows and 0-columns. Note that these sums are not themselves probabilities, since they may count the same events several times. In fact, they may be greater than one. However, they are useful in the inclusion–exclusion formulas for computing the actual probabilities. s[k , n0 , m , n ] :=

 k    m [n] r=0

r

k-r

p[r,k-r,n0,m,n]

The reasoning behind this formula is simple (Feller, 1957). If r is the number of 0-rows, then you have to have k − r 0-columns to get k total 0-vectors. Given that you want r rows,

106

J.P. Boyd, K.J. Jonas / Social Networks 23 (2001) 87–123



   m [n] ways of choosing them out of the m total rows, Similarly is r k−r the number of ways of choosing the k − r columns from the n total columns. Finally, p is the probability, defined above, of r particular rows and k − r particular columns being 0’s. Here are some examples. Note that S1 is greater than one. then there are

s[2, 47, 7, 9] 0.75374 s[1, 47, 7, 9] 1.42607 If we disable the tests in the definition of p, we get S1 in symbolic form.     n0 n0 m n n m  +   s[1,n0,m,n] =  mn mn m n Now let us substitute specific numbers for the variables to get a numerical answer.     n0 n0 m n n m   +   .{n- > 9.,m- > 7,n0 - > 47} mn mn m n 1.42607 This can be expressed more efficiently, using Perm, and we verify that the answer is the same. meanOffDiag[n0 Integer,m Integer,n Integer] := n Perm[n0 , m] + Perm[mn,n] m Perm[n0 , n] ; Perm[mn,n] meanOffDiag[47,7,9]//N 1.42607 Although S1 is not a probability, it does equal the expected value for the total number of 0-vectors. Here is the proof: let R1 , . . . , Rm be indicator variables for the event that the specified rows are all zero. That is, Ri = 1 if Bi,j = 0 for all j in the block B. Note that Pr[Ri = 1] = p[1, 0, n0 , m, n]. Similarly, define the indicator variables C1 , . . . , Cn for the columns being zero. The total number of 0-vectors can be represented as the sum of these indicator variables: R1 + · · · + Rm + C1 + · · · + Cn .

J.P. Boyd, K.J. Jonas / Social Networks 23 (2001) 87–123

107

Therefore, the mean of the 0-vector variable is the sum of the means of the indicator variables, and the result follows. In order to avoid using the full formula for Sk when the answer is going to be zero anyway, it will be convenient, though not strictly necessary, to know max0s the maximum number of possible 0-vectors in an m × n block with z 0’s. Clearly, this could be useful in choosing the limits of summation for Sk .  

√ max0s[z , m , n ] := Module   z root1s = If mn


If n ≤ root1s,Floor n , Floor[m − root1s] + Round[n − root1s] Recall that the floor function, often denoted by x, equals the greatest integer less than or equal to x. The derivation of max0s involves working with messy integer quadratic forms. One’s first guess would be that max0s = Max(z/m, z/n). The problem is that as the number of 0’s increases, then you can fit more 0-vectors if you mix rows and columns. Fig. 2 shows values of max0s with a fixed 10 × 10 block. Again following Feller (1957), the probability Pk that at least k of the events A1 , . . . , AN occur simultaneously is given by Pk =

N  i=k

(−1)

k−i



i−1 k−1

 Si .

In our case, N equals max0s, the maximum possible number of 0-vectors given the number of 0’s, rows, and columns.

Fig. 2. The maximum possible number of zero-vectors for a 10 × 5 off-diagonal block, as a function of the number z of zeros in the block.

108

J.P. Boyd, K.J. Jonas / Social Networks 23 (2001) 87–123

Table 13 Probability of at least k zero-vectors in a 7 × 9 off-diagonal block (antUpperRight) with 47 zeros k

P[k]

0 1 2 3 4 5 6 7 8 9

1 0.83727 0.443613 0.126531 0.0175785 0.00105483 0.000021932 9.46935 × 10−8 1.20362 × 10−11 0

pAtLeast[k , n0 , m , n ]:= If[k ≤ 0, 1,   max0s[n0,m,n] k-i i-1 s[i,n0,m,n] (−1) i=k k-1 ] We can make a table of values for k ranging from 0 to 9 for the upper-right-hand block of the ant1 data. Note the small number for k = 8. The numbers 1 and 0 are exact, indicated by a lack of decimal or following 0’s, as shown in Table 13. It is obvious that 1 − Pk is Fk −1 , where F is the cumulative distribution function. It is well known (Parzen, 1960, p. 211) that the mean of a distribution on the non-negative reals is ∞ i=0 (1 − Fi ). oneMinusCDF = Delete[Transpose[tablePk] [[2]], 1] {0.83727,0.443613,0.126531,0.0175785,0.00105483,0.000021932, 9.46935 × 10−8 ,1.20362 × 10−11 ,0} 1 – oneMinusCDF {0.16273,0.556387,0.873469,0.982421,0.999845,0.999978, 1.,1.,1} Length[oneMinusCDF] oneMinusCDF[[i]] i=1 1.42607 This shows that the observed pair of errors (a bad row plus a bad column) in the upper-right “regular” block is only a little above the expected number of 0-vectors (1.42607) if the 0’s were randomly distributed within the block. That is, having only two errors in this case is not evidence for regularity, considering the value of the cdf, F (2) = 0.873469. So far, the null hypothesis cannot be rejected. Notice also that 1.42607 is the previously computed mean. Still following in Feller’s (1957) footsteps, we can also compute the probability P[k] of the occurrence of exactly k among the N events, as shown in Table 14.

J.P. Boyd, K.J. Jonas / Social Networks 23 (2001) 87–123

109

Table 14 Probability of exactly k zero-vectors in a 7 × 9 off-diagonal (antUpperRight) block with 47 zeros k

P[k]

0 1 2 3 4 5 6 7 8 9

0.16273 0.393657 0.317082 0.108952 0.0165237 0.0010329 0.0000218373 9.46814 × 10−8 l.20362 × 10−11 0

P[k] =

N 

(−1)k−i

i=k

  i Si . k

pExactly[k ,n0 ,m ,n ] :=

[n0,m,n] max0s

(−1)k-i

  i s[i,n0,m,n] k

i=k The mean value for this distribution, m1 , is the inner product of the density function, antUpperRight, with the range [0, 1, . . . , 9] of non-zero density. m1 = antUpperRight.Range[0, 9] 1.42607 which agrees with the formula for the mean derived above and with the calculation above using Pk . Next, the second moment m2 , the expectation of the square of the number of regular errors, is given by the inner product of the density function with the squares of the range. m2 = antUpperRight.Map[#2 &, Range[0, 9]] 2.93355 Finally, the variance of this distribution is given below. m2 − m2 1 0.899874 Next, we find a closed-form expression for the variance of the number of regular errors for any off-diagonal block. Recall that the mean for the row and column indicator variables, Ri and Cj are r=

Perm[n0 ,n] ; Perm[mn,n]

c=

Perm[n0 ,m] . Perm[mn,m]

and

110

J.P. Boyd, K.J. Jonas / Social Networks 23 (2001) 87–123

Next, recall that the variance of a sum of random variables is the sum of their variances plus twice all possible covariances. Since a row indicator variable is a Bernoulli variable with parameter r, its variance is just r(1 − r). Similarly, the variance of each column indicator variable is c(1 − c). The covariance of two distinct row indicator variables is given by   n0 − n [n]  − r2 Cov[Ri ] = r  mn − n [n] where n is subtracted off to take account of the fact that n 0’s have already been used up. Similarly, the covariance of two distinct column indicator variables, is   n0 − m [m]  − c2 . Cov[Ci ] = c  mn − m [m] Finally, the variance of the sum of m row and n column indicator variables is given below. VarOffDiag[n0 Integer, m Integer, n Integer]:= Module[{r, c}, Perm[n0,n] //N; c = Perm[n0,m] //N; r = Perm Perm [mn,n] 

[mn,m] [n0−n,n] − r mr(1-r) + nc(1-c) + m(m-1)r Perm Perm[mn−n,n]     n0 − n

  m−1    − c + n(n − 1)c Perm[n0−m,m] − c  +2mnr  Perm[mn−m,m]  mn − n   m−1 Next, if we substitute values corresponding to the antUpperRight block, we see that we get the same answer, 0.899874, as before. varOffDiag[47, 7, 9] 0.899874 Naturally, it is very convenient having formulas for the mean and variance of regular errors, since most of the time the calculations of the entire distribution will now be unnecessary. Comparing with the previous table, we see that in this case P8 = P[8] because the probabilities for eight or greater are zero, i.e. the probability of “at least 8” equals that of “exactly 8”. Moving on to other data sets, let us first consider the Wasserman and Faust Countries Trade data B12 (1994). There are three off-diagonal regular-blocks, all with a high density of 1’s, shown in Tables 15–17. Let us compare these probabilities with the observed errors (0-vectors), 0, 0, and 2, respectively. The two 0’s are not strong evidence for regularity, since you would expect 0’s with probability 0.999978 and 0.818473, respectively. The two errors in the third block are

J.P. Boyd, K.J. Jonas / Social Networks 23 (2001) 87–123

111

Table 15 Probability of exactly k zero-vectors in a 5 × 7 off-diagonal (Countries) block with five zeros k

P[k]

0 1 2

0.999978 0.0000215629 0

Table 16 Probability of exactly k zero-vectors in a 7 × 7 off-diagonal (Countries) block with three zeros k

P[k]

0 1 2 3

0.818473 0.178818 0.00270898 0

actually strong evidence against regularity, because the probability of two or more 0’s is only 0.003222 + 0.000119 = 0.032341. In fact, this is strong support for the opposite of regularity, viz. that there are more 0-vectors than expected, not less. This surprising result will occur again and again in the data examined in this paper, with both exact calculations and Monte Carlo simulations. The 7 × 5 off-diagonal block, b12OffD21, has only five 0’s, so that we know there are only seven ways of distributing the 0-column (a 0-row would not be possible). Therefore, we can directly find the probability of having exactly (or at least) one 0-vector. 7 p[1, 0, 5, 7, 5] 0.0000215629 This agrees with the result from pExactly. 7.4. Errors for diagonal blocks Now that we know how to find the distribution of 0-vectors in off-diagonal blocks, we have to do the same for diagonal blocks. The presence of the diagonal in the diagonal block makes these calculations slightly more difficult. It is customary in social networks analysis to treat diagonal elements as “structural 0’s”, not to be included in either data gathering Table 17 Probability of exactly k zero-vectors in a 9 × 7 off-diagonal (Countries) block with three zeros k

P[k]

0 1 2 3 4

0.589426 0.378233 0.032222 0.000119076 0

112

J.P. Boyd, K.J. Jonas / Social Networks 23 (2001) 87–123

or its analysis. Since it is misleading to put 0’s on the main diagonal when in fact there is no information present at all, we have replaced diagonal 0’s with dashes in the data tables. Another advantage of dashes on the diagonal is that they make the matrices much easier to read. If we are considering the probability of a single 0-vector, then the adjustment for a diagonal block is simple: just reduce the length of the vector by one. On the other hand, the joint probability of a given row i and column j being zero depends on whether or not i = j . Suppose that the n × n diagonal block has n0 0’s (not counting the diagonal), and consider a specific (row, column) pair (i, j). If i = j , then subtracting one for both the row and the column intersecting the diagonal implies that the probability that both vectors are zero is   [n0 ] 2n − 2  . n2 − n 2n − 2 However, if i = j , then subtracting an additional 1 for double-counting the intersection on the off-diagonal implies that the probability that both vectors are zero is  

[n0 ] 2n − 3 n2 − n 2n − 3

 .

In general, the number of positions for 0’s taken up by r rows, c columns, with s in the intersection, is z = c(n − 1) + r(n − c − 1) + s. If we substitute r = k − c, then z = c2 + k(n − c − 1) + s. The probability of such an arrangement is  

n0 z



n2 − n [z]

.

The number of ways of picking rows and columns can be decomposed into cases corresponding to the size s of the overlap:       c∧r    c n−c n n n , = s r −s c r c s=0

where c ∧ r is the minimum of c and r. The sums sk of k = r + c events are given below.

J.P. Boyd, K.J. Jonas / Social Networks 23 (2001) 87–123

113

Table 18 The sums sk of zero-vectors in a hypothetical 4 × 4 diagonal block with three zeros k

sk

0 1 2 3 4

1 12/5 9/5 2/5 0

sDiag[k , n0 , n ]:= Module[{z},    k n  z = c2 + (−c + n − 1)k; c=0 c   

Min[c,k−c,n2 −n−z] s=0



   c  n − c n0   s k−c−s s+z       n2 − n 

s+z



] This can be computed for a simple example, which can be checked by hand and compared with Table 18. The reader can verify that there are only two possibilities: exactly two errors (both 1’s in the same row or column), with probability 3/5, or exactly three errors (both ones in different rows and columns), with probability 2/5. The mean and variance for this example are 12/5 and 6/25, respectively. As in the off-diagonal case, sDiag[1, n0, n] equals the expected value for the cell. The formula is   [n0 ] n−1 . 2n  n2 − n n−1 For example, substituting the values below in the formula 



[n0] n−1  meanDiag[n0 Integer,n Integer] := 2n  n2 − n 

n−1



meanDiag[4, 3] 12 5 matches our direct calculation above. Using the method of indicator variables, we see that the probability (and hence the mean) of either a row or a column vector v is given

114

J.P. Boyd, K.J. Jonas / Social Networks 23 (2001) 87–123

below.





[n0] n−1 ; v=  n2 − n 



n−1 v/.{n0 → 4, n → 3} 2 5 Note that the “−1” terms reflect the skipping of the diagonal element. The covariances of the indicator variables fall into three cases: (1) Cov[Ri , Rj ] or Cov[Ci , Cj ]; (2) Cov[Ri , Cj ] (i = j ); (3) Cov[Ri , Cj ] (i = j ). Case (1), where the vectors are either both rows or both columns, has covariance   n0 − n + 1 [n − 1]  − v2 Cov[Ri , Rj ] = Cov[Ck , Cl ] = v  (n − 1)2 n−1   n and there are 4 terms with this expression. For Case (2), the covariance is the same as 2 for Case (1), but for a different reason: the row and column intersect in the diagonal, which is not counted anyway. There are 2n such cases, which when added to the number of terms for Case (1), total 2n2 . Finally, Case (3), where i = j , is given by   n0 − n + 1 [n − 2]  − v2 Cov[Ri , Cl ] = v  (n − 1)2 n−2 because youhave  to take account of both the row–column intersection and the diagonal. n There are 4 terms like Case (3). Combining all these terms gives us the diagonal 2 variance, shown below.     n0 − n + 1   n−1   var = 2nv(1 − v) + 2n2 v   − v2    (n − 1)2 n−1     n0 − n + 1    n−2 n   +4  − v2  ; v  2 2   (n − 1) n−2 Continuing the simple example above, we see that it agrees with the our hand calculations. var/.{n0- > 4, n- > 3} 6 25

J.P. Boyd, K.J. Jonas / Social Networks 23 (2001) 87–123

115

The expression for varD is can be simplified using the Perm function.       n0     n−1   //N , varDiag[n0 Integer,n Integer] := Module  v =  2   n −n     n−1    2nv (1 − v) + n Perm[n0−n+21,n−1] − v Perm[(n−1) ,n−1]   Perm [n0−n+1,n−2] +(n − 1) −v ; Perm[(n−1)2 ,n−2] varDiag[4, 3] 0.24 A more extreme example is given by the food web example (Regan and Wade, 1996), which has a diagonal block with n = 107 and n0 = 10260. Here the expected number of regular errors is computed below, followed by the corresponding variance. meanDiag[10260, 107] //N 0.00492393 varDiag[10260, 107] 0.00492254 The observed number of regular errors for this block is 26, so the number of standard deviations past the mean is astronomically far away from being regular. 26√−meanDiag[10260,107] varDiag[10260,107] 370.507 In order to compute the probability of exactly k 0-vectors for diagonal blocks, we have to modify the max0s function to get the diagonal version, max0s Diag. These modifications involve omitting the diagonal and considering various special cases. max0sDiag[z ,n ] "  $ # 2 2 := Module root1s = If n − n < z,0, (n − 1) − z , If[EvenQ[n], If[z ≤ 3n2 /4 − n, Floor[n − 1 − root1s] + Round[n − 1 − root1s], n + max0s[z − 3n2 /4 + n,n/2,n/2]], If[z < (3n2 − 4n + 1)/4, Floor[n − 1 − root1s] + Round[n − 1 − root1s], n + max0s[z − (3n2 − 4n + 1)/4,(n + 1)/2,(n − 1)/2] ]

]

]

116

J.P. Boyd, K.J. Jonas / Social Networks 23 (2001) 87–123

Fig. 3. The maximum possible number of zero-vectors for a 7 × 7 diagonal block, as a function of the number z of zeros in the block.

An example for a square (7 × 7) matrix with z ranging from 0 to 42 is shown in Fig. 3. The pExact function can now be modified for the diagonal case, as defined below.

pExactDiag[k_, n0_, n_] :=
  Sum[(-1)^(k - i) Binomial[i, k] sDiag[i, n0, n],
    {i, k, max0sDiag[n0, n]}] // N

For the ant1 data, the diagonal block is 7 × 7 with one 0-row and one 0-column and n0 = 22 0's. The results shown in Table 19 indicate that the observed pair of errors (the 0-column one and the 0-row seven) is extremely unlikely under the null hypothesis. That is, the anti-regular (or popularity) hypothesis is again supported.

Now let us apply these results to the Sampson Affective (liking) data with the three-way partition proposed by White et al. (1976). The regular-blocks are the three diagonal blocks, of sizes 7 × 7, 7 × 7, and 4 × 4, respectively. The blocks have n0 = 23, 22, and 4 off-diagonal 0's, respectively.

Table 19
Probability of exactly k zero-vectors in a 7 × 7 diagonal block (antDiag) with 22 zeros

k    P[k]
0    0.810555
1    0.179866
2    0.00947492
3    0.000103775
4    1.14893 × 10^−7
5    0
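For reference, Table 19 corresponds, up to formatting, to the call below (assuming pExactDiag as reconstructed above; the name antDiag for the result is ours).

antDiag = Table[pExactDiag[k, 22, 7], {k, 0, 5}]
(* {0.810555, 0.179866, 0.00947492, 0.000103775, 1.14893*10^-7, 0} *)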


Table 20
Probability of exactly k zero-vectors for each of the three diagonal Sampson blocks for the "like" relation

k    P[k]: B1      P[k]: B2      P[k]: B3
0    0.749501      0.810555      0.854545
1    0.231937      0.179866      0.145455
2    0.018216      0.009475      0
3    0.000345      0.000104      0
4    9.6 × 10^−7   1.2 × 10^−7   0
5    0             0             0

There is a single error in the first block, which is not significantly more than expected. In fact, even no errors at all would not be surprising under the null hypothesis. The probabilities for the three blocks are combined in Table 20. We can also calculate the means and variances for these blocks.

{meanDiag[23, 7], meanDiag[22, 7], meanDiag[4, 4]} // N
{0.269408, 0.199128, 0.145455}

{varDiag[23, 7], varDiag[22, 7], varDiag[4, 4]}
{0.235343, 0.17905, 0.124298}

Since these blocks are independent (because we are conditioning on the number of 0's found in each block), we can find the mean and variance of their sum by adding. This results in a mean and variance of 0.613991 and 0.53869, respectively, for the sum of regular errors. The z-score for the one observed error is thus 0.525931, which supports neither regularity nor its opposite. The next sub-section will confirm this result by convolution of the probability distributions.

Moving on to the Countries data, we can compute the probability of exactly k errors in the 7 × 7 diagonal block as follows:

b12Diag = Table[pExactDiag[k, 8, 7], {k, 0, 1}];
TableForm[Table[{k - 1, b12Diag[[k]]}, {k, Length[b12Diag]}],
  TableHeadings -> {None, {"k", "P[k]"}}]

k    P[k]
0    0.999925
1    0.0000747266
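As a quick consistency check (our addition, assuming meanDiag as above), the expected number of 0-vectors for this block almost coincides with P[1], since two or more 0-vectors are vanishingly unlikely here.

meanDiag[8, 7] // N
(* approximately 0.0000747, essentially the P[1] entry above *)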

There are no errors in the 7 × 7 diagonal block, b12Diag, but since there are only eight 0's to begin with, this would be expected with probability 0.999925, while having exactly one error has probability 0.000075. Although it is interesting to compare these individual block probabilities with the observed number of errors, what is really needed is the probability of the total number of errors over the whole matrix or matrices. This is the purpose of the next sub-section.


7.5. Block convolution

More generally, if there is more than one regular-block (diagonal or not), then we assume that the numbers of 0-vectors in different blocks are independent. That is, if $f_i(z)$ is the probability that $Z_i$, the number of 0-vectors in the $i$th regular-block, equals $z$, then the probability distribution of the sum $Z = Z_1 + Z_2$ is given by the convolution of the two distributions:

$$f_Z(z) = (f_1 * f_2)(z) \equiv \sum_{k=0}^{z} f_1(k)\, f_2(z-k).$$

Thus, the probability of having exactly z 0-vectors in the two blocks is given by the function probSum.

probSum[x_List, y_List, k_Integer] :=
  Module[{imin = Max[1, k - Length[y] + 2], imax = Min[k + 1, Length[x]]},
    Sum[x[[i]] y[[-i + k + 2]], {i, imin, imax}]]

We can now compute the convolution of the two "regular" blocks in the ant1 data. The interpretation of the output entry for k = 4, which is in fact the number of errors observed, is that the probability of exactly four errors, in any combination from the two blocks, is 0.0360354, as shown in Table 21. Similarly, the three regular Sampson "like" blocks can be combined, as seen in Table 22; the conclusion here is, once again, that neither regularity nor its opposite is supported. The Mathematica operator FoldList can be used to find the cumulative distribution function for all the data sets. Next, the DeleteCases command deletes the redundant 1's at the end of the cdf, while Delete[-, 1] deletes the zero in the first entry.

Table 21
Probability of exactly k zero-vectors in the two regular blocks (ant1 data)

k     P[k]
0     0.131901
1     0.34835
2     0.32936
3     0.149091
4     0.0360354
5     0.00487454
6     0.000371388
7     0.0000155184
8     3.33034 × 10^−7
9     3.28411 × 10^−9
10    1.24486 × 10^−11
11    1.21273 × 10^−14
12    1.38287 × 10^−18
13    0
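A toy check of probSum (our example, not from the paper): convolving a three-point distribution with itself and verifying that the result is again a probability distribution.

f = {0.7, 0.2, 0.1};  (* P[Z = 0], P[Z = 1], P[Z = 2] *)
conv = Table[probSum[f, f, k], {k, 0, 4}]
(* {0.49, 0.28, 0.18, 0.04, 0.01} *)
Total[conv]
(* 1. *)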


Table 22
Probability of exactly k zero-vectors in the three regular blocks (Sampson "like" data)

k     P[k]
0     0.519147
1     0.364219
2     0.101289
3     0.014232
4     0.00107011
5     0.0000423475
6     8.20155 × 10^−7
7     6.96509 × 10^−9
8     2.0424 × 10^−11
9     1.61186 × 10^−14
10    0

The final result is shown in Table 23. Here again there is strong evidence against regular equivalence, conditioned on the distribution of 0's and 1's (where 1 is defined as "positive"). There are four errors, two in each block. The table shows that the probability of getting at most four errors is 0.994738; conversely, the probability of five or more errors is only 0.005262. Since four errors were observed, the relevant tail probability is 1 − 0.958703 = 0.041297. Granted, half of the errors are in the diagonal block, a perfect linear order, which appears to be the most likely hypothesis for that block.

For the Countries data, there are four blocks to be combined. Since probSum is defined as a binary operator, we have to combine blocks two at a time and then combine the results. It makes no difference how this is done, because the convolution of random variables is commutative and associative.

b12First2Sum = Table[probSum[b12offD21, b12Diag, k], {k, 0, 3}]
{0.999904, 0.0000962863, 1.61132 × 10^−9, 0}

b12Last2Sum = Table[probSum[b12offD23, b12offD24, k], {k, 0, 6}]
{0.482429, 0.414974, 0.0956045, 0.00688396, 0.000108582, 3.22574 × 10^−7, 0}

b12Sum = Table[probSum[b12First2Sum, b12Last2Sum, k], {k, 0, 8}]
{0.482383, 0.41498, 0.0956353, 0.00689251, 0.00010234, 3.33009 × 10^−7, 3.12344 × 10^−11, 5.19771 × 10^−16, 0}

Table 23
The cdf for the number k of zero-vectors for the ant1 regular blocks

k    F_k
0    0.131901
1    0.480252
2    0.809612
3    0.958703
4    0.994738
5    0.999613
6    0.999984
7    1
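A sketch of the FoldList pipeline just described (the variable names are ours; the pmf list is taken from Table 21): FoldList accumulates the running sums, Delete[-, 1] drops the leading zero, and DeleteCases would then remove the redundant trailing 1's.

antSum = {0.131901, 0.34835, 0.32936, 0.149091, 0.0360354, 0.00487454,
   0.000371388, 0.0000155184, 3.33034*10^-7, 3.28411*10^-9,
   1.24486*10^-11, 1.21273*10^-14, 1.38287*10^-18, 0};
cdf = Delete[FoldList[Plus, 0, antSum], 1]
(* running sums {0.131901, 0.480252, 0.809612, ..., 1.}, as in Table 23,
   up to rounding of the displayed pmf values *)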


Table 24
The cdf for the four regular blocks in the Countries data

k    F_k
0    0.482383
1    0.897363
2    0.992998
3    0.99989
4    1

Finally, the cumulative distribution function for the Countries data is shown in Table 24. There are only two errors, both in the last block, b12offD24. Since three out of the four blocks are perfectly regular, this would seem to be overall support for a regular trend. Again our intuition fails us: from Table 24, the probability of getting two or more errors by chance is 1 − 0.897363 = 0.102637, so the two observed errors fall well within what the null hypothesis allows, and the Countries data provide no significant evidence either for or against regularity.

8. Results and discussion

The probabilities of a given number of zero-vectors, interpreted as "errors" in a regular-block, were found by exact combinatorial analysis. The results agree with those obtained by permutation tests. This is hardly surprising, since both are based on the same probability model, the Fermi–Dirac statistics (the number of 0's is held constant, and the 0's are sampled without replacement) (Feller, 1957; Parzen, 1960). However, the two approaches are a valuable cross-check on each other, reducing the likelihood of programming errors in the permutation test and of mathematical errors in the calculations of the exact probabilities. The limitation of the exact methods employed here is that they are not as easy as the permutation approach to apply to models where the number of zeros in each row (or in each column) within a block is held constant. Other things being equal, however, the nod always has to go to an elegant analytic solution over lengthy Monte Carlo simulations.

Perhaps the most useful results were in Section 7, where formulas were derived for the mean and variance for both diagonal and off-diagonal blocks. These can be added up over all regular-blocks to give the overall mean and variance for the number of observed errors, from which z-scores can be computed. Furthermore, the process of addition makes the normal approximation much better, so p-values from the normal distribution can be used for larger data sets.

Both the exact and the permutation tests were applied to the four data sets: Sampson's Monastery, the Countries Trade Network, the Krackhardt High-Tech Managers data, and the Dominance Hierarchies in Leptothorax ants. In all four cases, the 0-blocks were very significant, with not a single instance out of 10,000 permutations having fewer errors than were found in the data. With the regular-blocks, however, there was no support for regularity in any of the four cases; in fact, there was, if anything, a trend toward having more errors in the data than in the permuted matrices!

What is really going on in the data presented in this paper is that we are finding a tendency away from, rather than toward, regularity, when the density of non-zero elements is held constant.
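The z-score aggregation described at the start of this section can be packaged in a few lines (a sketch of ours, assuming meanDiag and varDiag from Section 7; an analogous helper using the off-diagonal formulas works the same way).

overallZ[blocks_List, observed_] :=
  Module[{mu = Total[meanDiag @@@ blocks], s2 = Total[varDiag @@@ blocks]},
    (observed - mu)/Sqrt[s2]]

overallZ[{{23, 7}, {22, 7}, {4, 4}}, 1] // N
(* 0.525931, the Sampson z-score computed earlier *)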


What shall we call this observed tendency? First notice that for a block to be regular implies a constraint on the variance of the block marginals: since they all have to be at least 1, the range of possible variances is reduced. The opposite of a regular-block, then, is one in which the marginals have higher than average variance. This is because the presence of more 0-vectors than expected means that the other marginals have to be larger than average to make up the difference, resulting, under most conditions, in a larger than average variance in the marginals. For the columns, this is better known as a "popularity bias"; for the rows, it is known as a "response bias". This is probably the norm in most naturally occurring sociological, economic, political, or biological situations.

The substantive implication is that none of the "classic" examples of regular equivalence holds up under this model. This is despite the fact that some of these blockmodels were formed explicitly to illustrate regular equivalence; if that fact were taken into account, the trend away from regularity would be even more pronounced. The problem is that, absent exact probability calculations, there has been nothing on which to calibrate our intuitions. The result is that the innate human tendency to find structure where there is none has taken over. Just as the gambler believes that a run of successes indicates "luck" or other intervention, so have apparently "regular" blocks seduced us all into thinking that there was a real social process going on. However, just as experience in looking at scatter-plots with computed correlations has enabled social scientists to give surprisingly good "eye-ball" estimates of correlation coefficients, so too may the tools in this paper enable us to judge better any real tendency toward regularity. If confirmed by other authors on other data sets, the implications of this paper are clear: regular equivalence as a default model of social interaction must be abandoned.

Nevertheless, we remain convinced that nature must hold at least some examples of regular-blocks. Where can we find them? Certainly in situations where an "exclusion principle" holds, we can find regularity. One such exclusion principle is heterosexual marriage. If we were to look at marriage patterns in an endogamous village allowing no homosexual marriage (although polygamy would be allowed), we could block the married people according to sex and find perfect, but trivial, regularity. This regularity is trivial because it is regular by definition, instead of as an empirical finding. More generally, anthropological moiety and marriage-class systems would fit this model, where the society would be partitioned by sex and marriage-class, and where the relations might also include "mother" and "father". However, as soon as kinship relations are replaced by less restricted relations, such as "is attracted to", the results of this paper suggest that the anti-regular, i.e. popularity, bias would predominate.

A less obvious example of an equivalence that is regular by definition is the "reports to" relation in a formal hierarchy. The usual cultural requirement is that everyone, with the exception of the CEO at the top, reports to exactly one superior. Furthermore, this relation is usually thought of as acyclic. These two requirements generate a "rooted" (or oriented) tree. Rooted trees that have a large number of nodes compared to their height have a maximal regular equivalence with relatively few classes. For example, a tree that has a maximum distance of three from "leaves" to the root has at most 11 maximal regular equivalence classes (including the root). So we find again that the presence of a regular equivalence is an inevitable result of specific cultural rules.


In order to find regularity in more fluid modern societies, we would have to look for situations where the utility of the first few ties with members of a given block is great, but where the marginal utility falls off rapidly, or even becomes negative, as more ties are added. Let us reconsider the hypothetical example of doctors and patients: most doctors want a certain number of patients to support their practice, but too many patients would overload the doctors and their staffs. Similarly, a patient would not want to go to a doctor with too big a practice, since the delays, expenses (driven by supply and demand), and poor service would make a switch desirable. In the data matrices considered here, however, there is really little cost involved in having too many monks who like you, too many managers who seek occasional advice from you, too many countries that trade with yours, or too many other ants dominating the same ant.

In addition to the empirical failure of regular equivalence documented in this paper, the very concept of regular equivalence also violates one of the philosophical bases of the field of social networks. The essence of social networks is the study of local interactions among individuals. Sometimes these local interactions can produce global, or emergent, structures, and sometimes not. In either case, however, social network studies are closely tied to these local interactions among a small number of individuals. Regular equivalence, on the other hand, violates the spirit of this inquiry in that its definition logically depends on a global property, viz. the partition. A partition generates a set of equivalence classes, which is exactly the kind of group-level notion that social network analysis tries to avoid as a primitive concept. Note that this criticism does not apply to structural equivalence, which is defined in terms of interactions around specific individuals. It does not even apply to structural equivalence linked with "functorial reductions" (White et al., 1976) or the "semi-group equations" of Boyd (1991), because each such equation is defined on triads of individuals. However, other kinds of social equivalence, such as automorphic equivalence (Boyd and Everett, 1988), also violate the principle of local interaction.

It is straightforward to design combinatorial optimization programs based on greedy algorithms (Batagelj et al., 1992), simulated annealing (Boyd, 1991), or more sophisticated concepts (semi-definite programming: see Karger et al., 1998) to find the best regular equivalence, according to whatever criteria are specified. In this case, however, the statistical tests suggested here no longer apply directly, since they compare an observed block with others chosen at random; any algorithm that searches for the best regular equivalence may find one, even on random data. This is analogous to the situation with clustering algorithms, as can be seen from Bock's (1996) survey of simulation, resampling, and exact statistical tests for clustering models. Boyd (2000) has addressed this problem of taking account of the effect of a search algorithm on the statistics of regular equivalence. He designed a relatively fast algorithm, based on variable-depth local search, that can be used with a permutation test: the (locally) optimal regular equivalence based on the relational data is compared with the result of the same algorithm applied to permutations of the data. This new approach also finds no evidence for regular equivalence.

If regular equivalence fails empirically, then what is to replace it? One viable suggestion is a return to classical clustering methods; for example, one can maximize the between-block variance, and there are now methods (Bock, 1996) for finding and statistically evaluating such models. Another suggestion is to search specifically for 1-blocks with higher than
expected marginal variance, i.e. the popularity effect, which destroys regularity, should be taken as the default hypothesis. Algebraically, this is best modeled by the "central representatives condition" (Pattison, 1982; Kim and Roush, 1984), which suggests that the relations within and between roles are mediated by a small set of star players. We should not let the beauty of regular equivalence as a mathematical concept blind us either to its empirical shortcomings or to more valid alternatives.

References

Batagelj, V., 1994. Semi-rings for social networks analysis. Journal of Mathematical Sociology 19, 53–68.
Batagelj, V., Doreian, P., Ferligoj, A., 1992. An optimizational [sic] approach to regular equivalence. Social Networks 14, 121–135.
Bock, H.H., 1996. Probability models and hypothesis testing in partitioning cluster analysis. In: Arabie, P., Hubert, L.J., De Soete, G. (Eds.), Clustering and Classification. World Scientific Publishing, River Edge, NJ.
Boyd, J.P., 1991. Social Semi-groups. George Mason University Press, Fairfax, VA.
Boyd, J.P., 2000. Finding and testing regular equivalence. In: Proceedings of the Twentieth Annual International Sunbelt Social Network Conference, Vancouver, April 2000. Linton Freeman Festschrift, submitted for publication.
Boyd, J.P., Everett, M.G., 1988. Block structures of automorphism groups of social relations. Social Networks 10, 137–156.
Boyd, J.P., Everett, M.G., 1999. Relations, residuals, regular interiors, and relative regular equivalence. Social Networks 21, 147–165.
Cole, B., 1981. Dominance hierarchies in Leptothorax ants. Science 212, 83–84.
Everett, M.G., Borgatti, S.P., 1993. An extension of regular colouring of graphs to digraphs, networks and hypergraphs. Social Networks 15, 237–254.
Feller, W., 1957. An Introduction to Probability Theory and Its Applications, Vol. 1, 2nd Edition. Wiley, New York.
Good, P., 1994. Permutation Tests. Springer, New York.
Hubert, L.J., 1987. Assignment Methods in Combinatorial Data Analysis. Marcel Dekker, New York.
Karger, D., Motwani, R., Sudan, M., 1998. Approximate graph coloring by semi-definite programming. Journal of the ACM 45, 246–265.
Kim, K.H., Roush, F.W., 1984. Group relationships and homomorphisms of Boolean matrix semi-groups. Journal of Mathematical Psychology 28, 448–452.
Krackhardt, D., 1987. Cognitive social structures. Social Networks 9, 109–134.
Mantel, N., 1967. The detection of disease clustering and a generalized regression approach. Cancer Research 27, 209–220.
Parzen, E., 1960. Modern Probability Theory and Its Applications. Wiley, New York.
Pattison, P.E., 1982. The analysis of semi-groups of multirelational systems. Journal of Mathematical Psychology 25, 87–117.
Pattison, P.E., 1993. Algebraic Models for Social Networks. Cambridge University Press, Cambridge, MA.
Reagan, D.P., Waide, R.B., 1996. The Food Web of a Tropical Rain Forest. University of Chicago Press, Chicago.
Reitz, K.P., White, D.R., 1989. Rethinking the role concept: homomorphisms on social networks. In: Freeman, L.C., White, D.R., Romney, A.K. (Eds.), Research Methods in Social Network Analysis. George Mason University Press, Fairfax, VA.
Sailer, L., 1978. Structural equivalence: meaning and definition, computation and application. Social Networks 1, 73–90.
Sampson, S.E., 1969. Crisis in a Cloister. University Microfilms No. 69-5775, Ann Arbor, MI.
Wasserman, S., Faust, K., 1994. Social Network Analysis. Cambridge University Press, Cambridge, MA.
White, D.R., Reitz, K.P., 1983. Graph and semi-group homomorphisms on networks and relations. Social Networks 5, 143–234.
White, H.C., Boorman, S., Breiger, R., 1976. Social structure from multiple networks. I. Blockmodels of roles and positions. American Journal of Sociology 81, 730–780.