JOURNAL OF MATHEMATICAL PSYCHOLOGY 17, 21-63 (1978)

Constructing Blockmodels: How and Why

PHIPPS ARABIE
Department of Psychology, University of Minnesota, Minneapolis, Minnesota 55455

AND

SCOTT A. BOORMAN AND PAUL R. LEVITT
Department of Sociology, Yale University, New Haven, Connecticut 06520
Blockmodel approaches to network analysis as developed by Harrison White are shown to fall in a broader class of established data analysis methods based on matrix permutations (e.g., clique detection, seriation, permutation algorithms for sparse matrices). Blockmodels are seen as an important generalization of these earlier methods since they permit the data to characterize their own structure, instead of seeking to manifest some preconceived structure which is imposed by the investigator (e.g., cliques, hierarchies, or structural balance). General algorithms for the inductive construction of blockmodels thus occupy a central position in the development of the area. We discuss theoretical and practical aspects of the blockmodel search procedure which has been most widely used (CONCOR algorithm). It is proposed that the distinctive and advantageous feature of CONCOR is that it solves what is initially presented as a combinatorial problem (permutations of matrices to reveal zeroblocks) by representing the problem as a continuous one (analysis of correlation matrices). When this representation strategy receives further development, it is predicted that the fairly crude empirical approach of CONCOR will be supplanted by more powerful procedures within this same class.
The idea of permuting the rows and columns of a matrix to reveal a desired pattern is having an increasing influence in applied mathematics (see Jennings, 1968; Tewarson, 1973; Duff, 1977; also, papers in volumes edited by Reid, 1971; Rose and Willoughby, 1972; and Barker, 1977). The classical applications arise mainly in connection with the method of Gaussian elimination for the solution of large simultaneous linear systems (Wilkinson, 1965; Bunch and Rose, 1976). The objective in this instance is to rearrange rows and columns so as to achieve one of a variety of target patterns which are known to speed computation greatly in the course of inverting the coefficient matrix (e.g., Steward, 1965; Bunch, 1973; Lin & Mah, 1977; Alvarado, 1977). The same idea of searching for structure by means of an imposed permutation also carries over into decomposition theories in input-output and related systems (Weil, 1968; Weil & Kettler, 1971) and from there into mathematical economics and econometrics more generally [e.g., the Ando-Fisher-Simon (1963) theorems on nearly decomposable systems]. In the behavioral sciences, a different application is made of row-column permutations for the purpose of revealing hidden structure in a matrix. At least ostensibly, such a
permutation rendering order from chaos constitutes a parameter-free model. Such a model has obviously desirable features under most conventional views of scientific inference which stress parsimony and explanatory power. As soon as any measure of goodness-of-fit is introduced, this apparent advantage often tends to become somewhat blurred, especially if the theory behind the measure is that the data coded by the matrix are sampled from some continuous underlying distribution. Despite such ambiguities, it is intuitively convincing that row-column permutations of a matrix leave the raw data far more chaste than do data analysis techniques requiring a priori replacement or aggregation, e.g., taking ranks, or replacing subsets of the data by various summary statistics (Aczél & Daróczy, 1975; Theil, 1967). For this reason, permutation methods are an important member of the small but growing family of data analysis methods following the philosophy that aggregation is to be inferred at the end of the analysis, not imposed at the beginning (compare critical observations by Batchelder and Narens [1977, pp. 118-119] directed against naive formation of contingency tables from underlying dichotomous data).¹

This paper develops the thesis that matrix permutation methods can unify a number of classes of data analysis problems of broad interest to behavioral scientists, including clique detection, seriation, and empirical versions of balance theory, as well as more general procedures for network analysis and graph decomposition. The general technical unification centers around the concept of a blockmodel (Definition 1 below), introduced by Harrison C. White and co-workers (White, Boorman, & Breiger, 1976), together with the use of hierarchical clustering methods for obtaining blockmodel structure from sparse matrices (CONCOR algorithm and related algorithms). As we will see, there are numerous special cases for which the idea of a blockmodel is not new, and it is possible to trace the lineage of the concept to the 1890s (e.g., for seriation, Petrie, 1899). However, the natural generality of permutation methods as a means of identifying structure has only recently begun to appear, principally through White's work in application to social networks.
1. PURPOSES OF PERMUTING A MATRIX
A. Clique Detection

If a matrix is to be profitably rearranged, some explicit criterion should be advanced as to the result desired. Both because of its simplicity and for substantive reasons, the first type of target generally considered by sociologists was one to which mathematicians and economists have also paid considerable attention, namely the block diagonal form for a square matrix of dimension n:

    [ C_1   0   ...   0  ]
    [  0   C_2  ...   0  ]
    [  .    .    .    .  ]
    [  0    0   ...  C_r ]
¹ Compare Leijonhufvud (1968, Chap. 3) for compatible ideas in a macroeconomic context.
Here each of the C_i is a square nonzero submatrix and all entries off the block diagonal are zero. In conventional sociometry, the diagonal entries C_i are usually interpreted as cliques whose membership is determined by the corresponding rows and columns. In one of the earliest rigorous contributions to the clique detection literature (Katz, 1947, p. 235), the search for a pattern approximating block diagonal form is cast as a formal minimization problem with a prescribed objective function. This early reliance on an explicit target function anticipated a trend later to become widely prevalent in cluster analysis (Ward, 1963; Hubert, 1973; Carroll & Pruzansky, 1975) and sparse matrix methods (Ruhe, 1977, p. 136), as well as in scaling (Kruskal, 1964a). Later sociometric investigators drifted away from Katz's conception and a wide variety of competing algorithms made their appearance, although these have often been applied only illustratively if at all (see, e.g., Beum & Brundage, 1950, or Coleman & MacRae, 1960; more recent literature has been reviewed by Arabie, 1977).

However, as a constellation of methods for the empirical investigation of concrete social structure, virtually all of the huge follow-up literature on clique detection remains impoverished in two critical respects. First, attention is very rarely paid to the need for a simultaneous analysis of multiple networks in the same population, even though the multiplicity of relations operating in actual social structures is obvious (see Leik & Meeker, 1975, pp. 95-97). Second, the block diagonal pattern by definition excludes asymmetric relations between cliques, so that contrast effects (e.g., hierarchy versus alliance) are revealed only in the unsatisfactory capacity of an "error" in the model imposed. For both reasons, clique detection per se is at most a first step toward more comprehensive and versatile analyses.

B. Seriation

Largely unnoticed by psychologists and sociologists, archaeology has fostered a distinct tradition of data analysis known as seriation, originally traceable to pioneering work of the great Egyptologist Sir W. M. Flinders Petrie (1899). Seriation methods resemble cliquing procedures, but differ from them in replacing the block diagonal target by the band form. For the simplest case of a square matrix, band form is illustrated by the following raw and permuted artificial data (Tewarson, 1973, p. 70):
(raw)

1 1 0 1 0 0 0 0 1 0
1 1 1 0 0 0 0 0 1 0
0 1 1 0 1 0 0 0 1 0
1 0 0 1 1 1 0 0 1 1
0 1 1 1 1 0 0 1 1 1
0 0 0 1 0 1 1 0 0 1
0 0 0 0 0 1 1 1 0 1
0 0 0 0 1 0 1 1 0 1
1 1 1 1 1 0 0 0 1 1
0 0 0 1 1 1 1 1 1 1
(permuted to band form)

0 1 0 1 1 0 1 0 0 0
0 0 1 0 0 1 0 1 1 0
Here the same permutation is being imposed on both rows and columns, with the presumption being that row and column labels index the same population. For the more general case of a rectangular binary matrix, where rows and columns index different populations (as in an item-by-attribute incidence matrix), the appropriate generalization of the band form pattern is clear enough. In this case, different permutations for rows and columns are required. For the still broader concept of an abundance matrix, which in general may be any nonnegative real-valued matrix of arbitrary specified dimensions, the desired pattern is one where for each row the (conditional) distribution of entries is unimodal across the given ordering of columns. If the matrix is square, the maximum value must be attained on the principal diagonal; otherwise, the location of this value should fall as close to the diagonal as possible. Kendall (1970, 1971) gives a general formalization.

From the standpoint of theory, there is one important feature of band form patterns which is not shared by block diagonal forms: this advantage is the implied one-dimensional ordering of rows and columns, or of both together if the matrix is a square one with rows and columns indexing the same population. Thus in the case considered by Petrie, the raw data formed an incidence matrix for 800 varieties of objects found in 900 prehistoric Egyptian graves, and by rearranging this matrix he sought to reconstruct the chronology of the graves. For applications to social networks, the utility of the same idea in searching for social hierarchies is apparent, even if thus far little exploited in the empirical literature. For example, an especially promising method (Kupershtokh & Mirkin, 1971) of seriating a conditional proximities matrix (e.g., a sociomatrix where different actors' reported choices are not assumed to be directly comparable) has yet to be utilized. The possibility of seriating sociometric data manifestly accentuates the limitations of algorithms designed solely for cliquing or otherwise partitioning the actors, as tools for revealing hierarchy or more general patterns of stratified social organization (see also Bernard, 1974).

Many algorithms have been proposed for achieving explicit solutions to the seriation problem (Deutsch & Martin, 1971; Gelfand, 1971; Landau & de la Vega, 1971; Hubert, 1973, 1976). Presaging the general utility of continuous methods in solving ostensibly discrete permutation problems, the most successful approach currently seems to be through application of nonmetric multidimensional scaling. First implemented by
Kendall (1969, 1970), this approach requires the preliminary conversion of a given m × n abundance matrix A into a symmetric matrix (e.g., by multiplying with the transpose to obtain AAᵗ) and then entering the symmetrized matrix as input to one of the nonmetric multidimensional scaling algorithms (Shepard, 1962a, b; Kruskal, 1964a, b) to obtain a Euclidean solution in two dimensions. In the most successful applications (Kendall, 1971), the desired seriation may be recovered from a "horseshoe-shaped" scaling solution, suggestive of a Hamiltonian arc passing exactly once through all points of a graph (Wilkinson, 1971).

There are two reasons for embedding the problem in two dimensions, even though a one-dimensional solution alone is desired. It is instructive to review them, since similar considerations often apply to motivate use of continuous methods in other combinatorial problems. First, the widespread local minimum problem for nonmetric scaling algorithms (Arabie, 1973) proves to be exceptionally serious when computation is proceeding in a one-dimensional space (Shepard, 1974). As a result, the extra degrees of freedom allowed by a second dimension may be exploited to obtain a better unidimensional solution. Second, there is the problem of obtaining a viable approximate or near solution, where an exact or true solution is not possible. As Kendall has emphasized, many abundance matrices can be only approximately seriated, if at all. By obtaining a two-dimensional solution, the investigator can see whether the results in fact suggest a unidimensional pattern, or can alternatively reject this hypothesis altogether if no such pattern presents itself (see Kruskal [1972] for proposals as to how such a hypothesis may be tested formally).

The strategy of deliberately obtaining a one-dimensional representation from a two-dimensional plot is somewhat amusingly at odds with the tradition (frequently encountered in psychological studies) of mistaking a distorted line for a configuration or manifold of higher dimension [for example, see Shepard's (1974, pp. 387-388) well-directed critique of the analysis offered by Levelt, Van de Geer, & Plomp, 1966]. Thus, it would seem that psychologists as well as sociologists could benefit from enhanced contact with alternative quantitative traditions in other behavioral sciences. Recently, psychologists have paid increasing attention to the problem of seriation (e.g., Ducamp & Falmagne, 1969; Coombs & Smith, 1973; Hubert, 1974a, b, d, 1976; Shepard, 1974). A promising direction for future innovations would seem to be in the application of the Shepard and Carroll (1966) parametric mapping algorithm to the seriation of matrices. This procedure, which is also a continuous method, should ultimately allow far greater generality in seriation than Kruskal's (1964a, b) MDSCAL, although the facilities required to implement the parametric mapping program are still not widely available.
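As a concrete illustration of this continuous route to seriation, the following sketch (in Python, with scikit-learn's nonmetric scaling routine standing in for Kruskal's MDSCAL; the small abundance matrix and the angular read-off of the ordering are our own illustrative choices, not part of Kendall's procedure) symmetrizes an abundance matrix by multiplication with its transpose, scales the result in two dimensions, and reads a tentative one-dimensional order off the configuration.

import numpy as np
from sklearn.manifold import MDS

# Illustrative abundance (incidence) matrix: rows = items to be seriated, columns = attributes.
A = np.array([[1, 1, 0, 0, 0],
              [1, 1, 1, 0, 0],
              [0, 1, 1, 1, 0],
              [0, 0, 1, 1, 1],
              [0, 0, 0, 1, 1]], dtype=float)

S = A @ A.T                      # symmetrize by multiplying with the transpose
D = S.max() - S                  # convert similarities to dissimilarities
np.fill_diagonal(D, 0.0)

# Two-dimensional nonmetric scaling of the symmetrized matrix.
mds = MDS(n_components=2, metric=False, dissimilarity="precomputed", random_state=0)
X = mds.fit_transform(D)

# If the configuration is horseshoe-shaped, an ordering can be read off, e.g.,
# from the angular position of each point around the centroid.
angles = np.arctan2(X[:, 1] - X[:, 1].mean(), X[:, 0] - X[:, 0].mean())
print("suggested seriation of rows:", np.argsort(angles))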
C. Other Target Configurations
There are, of course, many intuitively interesting and practically relevant target configurations for a permuted matrix other than block or band diagonal form. The block triangular form (Tewarson, 1973, pp. 50ff.) is widely encountered wherever relations among variables are hierarchically structured; examples of this and similar forms frequently occur in sociometric data obeying generalizations of the balance concept that
combine hierarchy and cliquing ideas.² A number of more complex targets, obtained by combining and weakening the restrictive features of block diagonal, band, and block triangular patterns in various ways, have also been considered by numerical analysts (Tewarson, 1971; 1973, p. 75). As in each of the cases already considered, such targets are characterized with reference to the distribution of zero submatrices and half-matrices, e.g.:
where each single entry now refers to an entire blocked submatrix of the partitioned data. These particular configurations are referred to respectively by Tewarson (1973) as SBBDF (singly bordered block diagonal form) and BBTF (bordered block triangular form); of course, the sizes shown (4 × 4) are purely illustrative and arbitrary. For configurations like these to be useful, the data are required to be quite sparse, with 0's constituting an appreciable percentage of the entries. (It is no accident that much of the standard mathematical literature on matrix permutations grows out of the theory of sparse matrices.) Since low density is an observed characteristic of virtually all sociometric data and many other types of social network data as well, the relative sparsity of the underlying network may be treated as a valid background assumption for the purpose of applying blockmodel methods to data of this type (contrast the distinctly nonsparse systems considered in various economic problems, e.g., Scarf, 1967; see also Porsching, 1976).

For the moment, we call attention to three limitations in the class of data representation strategies considered thus far. First, whenever permutations are considered only with respect to targets of some one fixed type, or class of types (e.g., block diagonal, band, or block triangular forms), the analysis starts with strong assumptions as to the type of structure which the data are allowed to reveal. In certain cases, the investigator may have good a priori intuition regarding the data, and strong assumptions as to the likely pattern may then be justified. Much more frequently, however, and particularly in the analysis of complex social structures, there is no reliable guide to the probable structure of a complex network; cultural prescriptions and common sense may provide some clues, but more often than not they are likely to be misleading rather than informative (White, 1963; 1972). Therefore, the first limitation of any imposed target configuration is that the target presumes knowledge which the researcher probably does not have and cannot acquire prior to successful testing of a hypothesis. Even for simple networks, there is an astronomical number of alternative hypotheses (see Lorrain, 1973, for 2 × 2 × 2 patterns only) and a blind enumerative search seems scarcely satisfactory.

² The classical use of block triangular networks in social science arises in the so-called Austrian capital theories (Dorfman, Samuelson, & Solow, 1957, pp. 205, 234), which postulate a natural stratification of industries, with industries in stratum i deriving factor input from industries in the next most primary stratum (i − 1). These theories fit data poorly, at least when applied naively to input-output tableaus.
A second limitation concerns the ad hoc nature of the algorithms used for testing different hypotheses. For example, cliquing algorithms as a family are distinct from seriation methods, and the latter in turn from the large number of distinct particular procedures which sparse matrix theorists have advanced for seeking more complex targets (Tewarson, 1973). As long as a distinct method is required for seeking each type of target, it is clearly not feasible to investigate more than a very limited set of types, and even here it is often not possible to say with confidence whether a failure to reveal structure of a given form is a feature of the data, or instead a failure in the algorithm employed.

Thirdly, none of the procedures described thus far is amenable to direct extension to the simultaneous analysis of multiple networks, despite the presence of multiple kinds of simultaneous relations in even simple social structures (Lorrain & White, 1971; Boissevain, 1974; Breiger, 1976). As long as one adopts the basic standpoint of permutation methods, it should be feasible to develop a class of procedures for permuting multiple data matrices in a search for patterns which are simultaneously interpretable across all components. This philosophy is clearly stated in Lorrain's treatise on social networks (1975). In the broader setting of data analysis techniques for scaling and conjoint measurement, Carroll and Wish (1974) have placed similar emphasis on the advantages inhering in three-way as opposed to two-way data analysis procedures. However, none of the above literature presents practical algorithmic means of handling the simultaneous network analysis problem even for small networks (and we may note in passing that the popular method of Guttman scaling [see McConaghy, 1975] similarly falls short in this respect). We now turn to a class of algorithms designed to handle the three objections we have raised.
2. BACKGROUND OF THE BLOCKMODEL APPROACH
A. Antecedents

Perhaps the earliest published representation of structure in data which involves the essential ideas of blockmodeling is that given by Lambert and Williams (1962, especially pp. 784-785).³ Their data set comprised an items (species) by attributes (sites) matrix having binary entries denoting the presence or absence of a particular species at a given

³ One methodologically similar study, antedating Lambert and Williams by many years, is that of Beyle (1931), who applied what he referred to as the method of "attribute-cluster-blocs" to identifying voting factions in the Minnesota State Legislature [this method actually derives from Rice (1927), also a study of voting cliques]. While a large amount of Beyle's terminology is reminiscent of later blockmodel vocabulary [for example, he distinguishes among "principal blocs," "inner fringes," "outer fringes," and "lesser bloc systems" (pp. 35, 62-69, 82-83)], it is important to realize that, unlike Lambert and Williams, his analysis was always primarily directed toward clique detection and immediate generalizations also focusing on patterns along the main (blocked) diagonal. We are indebted to William Panning for calling this work to our attention (unpublished memo, School of Public and Urban Policy, University of Pennsylvania), along with the large subsequent literature in the quantitative study of legislative behavior.
TABLE 1

Two-Way Incidence Matrix of 76 Vascular Plants by 20 Sites^a

(The body of the table is a binary species-by-site incidence matrix, permuted and blocked into species groups A-F and groups of sites; an entry of 1 denotes presence of a species at a site. Species groups: A: Calluna vulgaris, Erica cinerea, E. tetralix, Molinia caerulea, Polygala serpyllifolia, Ulex minor. B: Drosera intermedia, D. rotundifolia, Eriophorum angustifolium, Juncus squarrosus, Pinus sylvestris, Trichophorum caespitosum. C: Agrostis setacea, Carex pilulifera, Festuca ovina, Potentilla erecta, Pteridium aquilinum, Quercus robur. D: Carex panicea, Juncus acutiflorus, Narthecium ossifragum, Pedicularis sylvatica. E: Anemone nemorosa, Anthoxanthum odoratum, Betula verrucosa, Campanula rotundifolia, Carex binervis, Castanea sativa, Cerastium vulgatum, Galium hercynicum, Hieracium pilosella, Hypericum humifusum, Hypochaeris radicata, Lathyrus montanus, Lonicera periclymenum, Lotus corniculatus, L. uliginosus, Luzula multiflora, Orchis ericetorum, Sieglingia decumbens, Succisa pratensis, Teucrium scorodonia, Veronica chamaedrys. F: 33 other species.)

^a Adapted from Lambert and Williams (1962, p. 784).
TABLE 2

Density Matrix and Histogram for Block Densities in Table 1^a

                         Species group
Site group      A       B       C       D       E
     1        0.73    0.03    0.57    0.25    0.40
     2        0.67    0.08    0.75    0.06    0.04
     3        0.50    0.79    0.08    0.44    0
     4        0.81    0.17    0.05    0       0.01

^a The matrix shown is a partial one, with 33 species being deleted from the sample for reasons discussed in the source paper (Lambert & Williams, 1962, p. 786).
site. The authors sought to divide the species into homogeneous groups by finding one permutation of the species and another permutation of the sites in order to obtain what they called the "logical collation" of homogeneous blocks of zeros and ones. In ecology, this problem is related to the classical issue of defining ecological niches (Slobodkin, 1961), and the Lambert-Williams solution may be interpreted as one procedure for operationalizing a niche concept. One of Lambert and Williams' blockmodels is here reproduced in Table 1 (data) and Table 2 (densities). Thus, we have the situation where the goal is to permute an m × n matrix so that rows and columns may be separately partitioned into comparatively homogeneous groups according to boundaries inferred from the distribution of 0's and 1's.

To implement the permutations, Lambert and Williams employed factor analysis, first on the columns, to obtain a bipartite split of the columns with reference to the site (column) having the highest centroid loading. Each of the disjoint subsets thus obtained was then independently subdivided, etc., in an iterative manner. The same procedure was then separately applied to the rows. In the blocked matrix, notice that the obtained distribution of block densities (displayed directly in Table 2) is a rather clearly bimodal one, thus foreshadowing the preferred pattern in sociometric cases (e.g., Table 3 below). Procedures for implementation alternative to those of Lambert and Williams (1962) are presented later, but the goal of revealing contrast between blocks of low and high density in the blockmodel approach is identical to that of these earlier investigators.

The crucial point of departure between the Lambert-Williams procedure and the algorithms discussed previously is that the Lambert-Williams analysis allows the data to determine their own natural pattern, with no need for the investigator to drum up the appropriate target. The essential theoretical rationale of searching for a block pattern like Table 1 had been stated even earlier by Goodall (1953, p. 39) when he observed that:
... if data can be so divided that within each subdivision no interspecific correlations occur, these subdivisions represent elementary classification units of the vegetation, and an objective classification may be arrived at if methods can be found satisfying this requirement.... The simplest and apparently the most satisfactory [procedure] consists in finding the most frequent species showing significant correlations with others, and separating in the first instance all quadrats [columns] containing this species. Within this group correlations are again tested, and the process is repeated until a group with no significant correlations has been extracted. All other quadrats are then lumped together and the whole process is begun again from the beginning. When the whole collection of quadrat data has been divided in this way into homogeneous groups, these groups are recombined where this can be done without the resulting larger group showing interspecific correlations.
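The divide-and-test logic that Goodall describes, and that Lambert and Williams implemented with centroid factor loadings, can be sketched compactly as follows (Python; the use of the sign of the leading eigenvector of the correlation matrix is our rough stand-in for their centroid-loading criterion, and only a single level of splitting is shown rather than the full iterative subdivision):

import numpy as np

def bipartite_split(X):
    # Split the columns of X into two groups (labeled 0/1) by the sign of their
    # loading on the leading eigenvector of the column correlation matrix.
    # (Constant columns would need special handling.)
    R = np.corrcoef(X, rowvar=False)
    _, V = np.linalg.eigh(R)              # eigh returns eigenvalues in ascending order
    return (V[:, -1] > 0).astype(int)

def two_mode_blocking(A):
    # One level of separate row and column splitting for a binary
    # items-by-attributes matrix A; returns the 2 x 2 matrix of block densities.
    col_groups = bipartite_split(A)       # split attributes (columns)
    row_groups = bipartite_split(A.T)     # split items (rows)
    D = np.zeros((2, 2))
    for i in range(2):
        for j in range(2):
            rows = np.where(row_groups == i)[0]
            cols = np.where(col_groups == j)[0]
            block = A[np.ix_(rows, cols)]
            D[i, j] = block.mean() if block.size else 0.0
    return D

A markedly bimodal set of values in the returned 2 × 2 density matrix would play the same role as the contrast visible in Table 2.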
Although Goodall's (1953) paper closely presaged the philosophy for seeking a blockmodel, he presented no especially compelling illustrations, probably owing to the lack of available computers at that time. Thus, to our knowledge, Lambert and Williams (1962) were the first to obtain a concrete blockmodel on data. Other roughly contemporary work (e.g., Needham, 1965) suggested similar theoretical approaches, and conceptually related studies began to appear sporadically (e.g., Lingoes & Cooper, 1971; D'Andrade, Quinn, Nerlove, & Romney, 1972, p. 30). In this later work, the most notable contribution is the procedure of McCormick, Schweitzer, & White (1972), which foreshadowed most of the main ideas of a complete blockmodel theory: a general target pattern (called the "block checkerboard form" by these investigators), an explicit target function for producing this pattern by row/column permutations, as well as a practical quadratic assignment algorithm for implementing the desired optimization. This work was subsequently followed up by Lenstra (1974) (see also Lenstra & Rinnooy Kan, 1975), who related the permutation procedure of McCormick et al. to the traveling-salesman problem in combinatorial theory.

B. Blockmodels

The pioneering approach of Lambert and Williams (1962) was directed toward a single m × n binary matrix, with the investigator's discretion over the procedure limited to specification of the fineness of row and column partitions (in other words, how many distinct blocks to allow for each). The present paper is primarily concerned with the approach to blockmodels developed by White and co-workers (White et al., 1976). This work began in the early 1970s without knowledge of the earlier contributions by Lambert and Williams or McCormick et al. Since White's emphasis was directed toward social structures, his approach is distinct in several substantive ways which will now be considered.

The fundamental data structure considered generally took the three-way form of k square nonnegative matrices, each with n rows and n columns denoting the same population of actors. Each of these matrices coded a different type of tie or network relation (e.g., "Alliance," "Antagonism," "Contact," etc.). The matrices were often but not necessarily binary [thus in Sampson's (1969) monastery data (discussed at length in Breiger et al., 1975) the data reported were ordered choices with an entry of +3
corresponding to top choice, +2 to second choice, and +1 to third choice]. The relations were typically irreflexive (as is customary in sociometry) but almost never symmetric or (completely) transitive. Given data of this type, White posed the question of whether a permutation could be identified which, when applied simultaneously to both rows and columns of each of the k matrices, would reveal a substantively interpretable blockmodel for each type of tie. Such a blockmodel would make concrete the concept of structural equivalence, i.e., would locate subsets of actors occupying identical (up to an automorphism) or at least closely similar positions in the social structure (Lorrain, 1975; Lorrain & White, 1971). As an example of a blockmodel permitting such an interpretation, consider the following blocked matrices, respectively coding positive and negative sentiment in a population of seven individuals:
P = a 7 × 7 binary matrix, shown blocked under a bipartition of the population into a block of four actors and a block of three, in which every positive tie falls within one of the two diagonal blocks;

N = the corresponding 7 × 7 binary matrix of negative ties, in which every tie falls within one of the two off-diagonal blocks.
In the permutation shown and with the indicated bipartition, the blocked diagonal of the N matrix consists only of zeroblocks, as does the blocked off-diagonal of the P matrix. A compact coding of the aggregate structure is hence furnished by the image matrices
    P = [ 1 0 ]        N = [ 0 1 ]
        [ 0 1 ],           [ 1 0 ],
where a "0" in the image corresponds to a zeroblock in the data matrix and a "1" corresponds to a block containing at least some 1's. By convention, we will retain the term "block" to refer both to blocked submatrices of the raw data matrices (the 7 × 7 matrices in this example) and to the subsets of the actors defined by the partition (see also Appendix).⁴ Observe that the ordering of blocks is arbitrary, as is the ordering of actors within blocks. Formally,
DEFINITION 1. Let an n × n × k binary array A be given. A blockmodel B of A is
⁴ In a model with R cells in the row partition and C cells in the column partition, the blockmodel determines RC submatrices. When rows and columns index the same population, R = C and there are C² submatrices. When we speak of a 2-block, 3-block, etc. model we are always referring to C, so that a 2-block model has four blocked submatrices, etc.
an m × m × k binary array, m < n, together with an onto mapping φ: {1, 2,..., n} → {1, 2,..., m} for which

    B(i, j, q) = ⋃_{c ∈ φ⁻¹(i)} ⋃_{d ∈ φ⁻¹(j)} A(c, d, q)
for all 1 ≤ i, j ≤ m, 1 ≤ q ≤ k, where ⋃ refers to Boolean sum. Then the image matrices are the matrices corresponding to B(i, j, q) for fixed q, q = 1, 2,..., k; the dimension of these matrices equals the number of blocks m (in the example just given, m = 2). Definition 1 is formally equivalent to the definition of a graph homomorphism in the category of multigraphs (e.g., Lorrain, 1975; see also Berge, 1962).

It is clear that not all blockmodels will be equally informative. The 2 × 2 × 2 (P, N) blockmodel just presented should be interpretively familiar, since it is a version of balance theory (Abelson & Rosenberg, 1958; Friedell, 1962) where two nonoverlapping cliques direct all positive ties internally and all negative ties externally. Notice, however, that the model does not require cliques to be complete subgraphs (Harary, Norman, & Cartwright, 1965). In fact, for the artificial data presented, none of the nonzero blocks has a density exceeding 70% (calculated to exclude diagonal entries; see Appendix). Thus, as in the case of the Table 1 blockmodel pattern, the blockmodel is given structure by the absence of ties in certain blocks instead of by the high density of blocks containing ties. Such an emphasis on zeroblocks dovetails with, and generalizes, the previous stress on zeroblocks in the various specific target configurations discussed in Section 1. As in those cases, a search for revealed zeroblocks is compatible with low density network data.

A slight generalization of the zeroblock formalism proves very useful in applications where data exhibit various small impurities.

DEFINITION 2. Let α ≥ 0 be fixed. An α-blockmodel of the n × n × k binary array A is an m × m × k binary array B together with an onto mapping φ: {1, 2,..., n} → {1, 2,..., m} for which

    B(i, j, q) = 1    if D(φ⁻¹(i), φ⁻¹(j), q) > α,
    B(i, j, q) = 0    otherwise,

where D(φ⁻¹(i), φ⁻¹(j), q) is the density of the submatrix associated to (i, j, q) by φ. Thus B(i, j, q) is the characteristic function of D > α (see also Appendix). Definition 1 is the special case where α = 0. More generally, α > 0 is a measure of the tolerance of the blockmodel for zeroblock impurities; increasing α corresponds to progressively more lenient zeroblock classification until finally, when α is greater than the maximum density (for a specified data array A), all images become zero images under the chosen coding, corresponding to a blockmodel with trivial structure.
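A minimal computational rendering of Definitions 1 and 2 may be helpful (Python sketch; the function and array names are ours, and diagonal entries are excluded from the density calculation, following the convention for sociometric data noted above):

import numpy as np

def alpha_blockmodel(A, phi, alpha=0.0):
    # A: n x n x k binary array; phi: length-n vector of block labels 0..m-1.
    # Returns (D, B): the m x m x k array of block densities and the corresponding
    # image array, with alpha = 0 reproducing the strict zeroblock criterion.
    n, _, k = A.shape
    m = int(phi.max()) + 1
    D = np.zeros((m, m, k))
    for i in range(m):
        rows = np.where(phi == i)[0]
        for j in range(m):
            cols = np.where(phi == j)[0]
            sub = A[np.ix_(rows, cols)]              # |rows| x |cols| x k submatrix
            ones = sub.sum(axis=(0, 1)).astype(float)
            cells = len(rows) * len(cols)
            if i == j:                               # exclude the principal diagonal
                ones -= np.einsum('aak->k', sub)
                cells -= len(rows)
            D[i, j] = ones / max(cells, 1)
    B = (D > alpha).astype(int)
    return D, B

For the seven-actor example above (with the actors numbered as displayed), stacking P and N along a third axis and calling alpha_blockmodel with phi = np.array([0, 0, 0, 0, 1, 1, 1]) would return the 2 × 2 × 2 image array shown earlier.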
A natural choice for α would be a value which dichotomizes a naturally bimodal distribution of block densities in the data; see Table 2 (suggesting 0.2 < α < 0.4) and Table 3, where α is chosen to be the grand mean of all block densities (see also Section 2E).

C. BLOCKER and the Enumeration of Homomorphisms

The first systematic approach used by White and his colleagues to obtain blocking patterns in n × n × k data consisted of an interactive computer program BLOCKER, originated by Gregory Heil in 1974 and implemented by him in APL (implementation described in Heil & White, 1976). To use the BLOCKER program, one begins with a data structure of k n × n matrices, each of which is constrained to be binary. The data analyst is required to supply a blockmodel "hypothesis" for each type of tie, i.e., an image matrix like those shown earlier for P and N, for each of the k given data matrices (for m blocks, km² binary decisions are thus made). Specifying blockmodel hypotheses thus entails a decision as to the degree of refinement desired, i.e., the number of blocks m in the partition. The user is also required to supply a lower bound on block size, which may be used to preclude various trivial solutions, e.g., (n − 1, 1) splits.

The output from BLOCKER consists of an enumeration of the permutation(s) conforming to the specified hypothesis, i.e., permutations which (when applied simultaneously to each of the k given binary sociomatrices) will produce a blocking compatible with the hypothesis for each type of tie. This output allows the investigator simultaneously to explore existence of solutions for the given hypotheses as well as the uniqueness of any particular solution. Notice, however, that if a solution exists at all, there will normally be many compatible permutations, since the ordering of blocks and of actors within blocks is arbitrary, as already noted; thus uniqueness must be interpreted at the level of the partition of actors into blocks, not that of the permutation chosen to represent the blockmodel when the data are shown in blocked matrix form (formally, the invariance group of the blockmodel is thus the automorphism group of the corresponding partition; see Ore, 1942). When used as we have indicated, BLOCKER furnishes a highly convenient way of testing conjectural blockmodels. Thus, BLOCKER may be used to falsify a particular blockmodel hypothesis as being unobtainable by any permutation and division of the data, or, alternatively, to render ambiguous a hypothesis which is the nonunique outcome of several quite different partitions.

Also in connection with failures of uniqueness, a further distinctive feature of BLOCKER is the list of "floaters" supplied as output. These are actors whose block membership is indeterminate under the hypothesis fitting the data, i.e., those actors who may be assigned to two or more different blocks without violating the hypothesis. A frequently occurring example arises in connection with a commonly observed two-block hypothesis V, where any individual in the top block (assumed to contain more than one individual) may be reassigned to the bottom block without violating the hypothesis. Formally,
DEFINITION 3. Let A be an n × n × k binary array and let B be an m × m × k
binary array, m < n, which is a blockmodel of A under Definition 1 under the onto mapping

    φ: {1, 2,..., n} → {1, 2,..., m}.

If 1 ≤ i ≤ n, i is a floater in the blockmodel iff there exists a second onto mapping φ⁺: {1, 2,..., n} → {1, 2,..., m} for which φ⁺(i) ≠ φ(i), φ⁺(j) = φ(j) for all j ≠ i, and under which B is also a blockmodel of A.
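Definition 3 invites a direct, brute-force check, sketched below in Python (the function names are ours; the test uses the strict Definition 1 image criterion, and onto-ness of the perturbed mapping is enforced explicitly):

import numpy as np

def is_blockmodel(A, phi, B):
    # True iff B is the (Definition 1) image of A under the partition phi.
    m = B.shape[0]
    for i in range(m):
        for j in range(m):
            rows = np.where(phi == i)[0]
            cols = np.where(phi == j)[0]
            has_tie = A[np.ix_(rows, cols)].any(axis=(0, 1)).astype(int)
            if not np.array_equal(has_tie, np.asarray(B[i, j])):
                return False
    return True

def is_floater(A, phi, B, i):
    # True iff actor i can be reassigned to a different block with B still a blockmodel of A.
    m = B.shape[0]
    for new_block in range(m):
        if new_block == phi[i]:
            continue
        phi_plus = phi.copy()
        phi_plus[i] = new_block
        if len(np.unique(phi_plus)) == m and is_blockmodel(A, phi_plus, B):
            return True
    return False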
It is easy to see that a sociometric full isolate, i.e., an individual who receives and sends no choices in any network, is always a floater among all blocks. As we have seen through the V pattern example, the converse is not true. Through its operationalization of the floater concept, BLOCKER thus provides an unusually explicit representation of ambiguity in social position, a phenomenon whose analysis had been sporadically foreshadowed in previous literature but had heretofore received much more ad hoc treatment (e.g., Coleman & MacRae, 1960, where it is left to the investigator to decide subjectively which individuals are to be considered floaters).

The implementation of BLOCKER also develops something approaching the dual concept of a crystallizer, an individual whose block placement has a disproportionate effect on implying block assignments for other actors, subject to given data on prevailing network ties. Specifically, the first step of BLOCKER's search for mappings φ compatible with a given target configuration B is to establish a conflict array (called CFL in Heil & White, 1976):

DEFINITION 4. Let A be an n × n × k binary data array and B an m × m × k binary array whose status as a possible blockmodel of A is being investigated. The conflict array CFL = CFL(A, B) associated with A and B is a 4-array
    CFL(c, d, i, j) = 1    if there exists q, 1 ≤ q ≤ k, for which A(i, j, q) = 1 and B(c, d, q) = 0,
    CFL(c, d, i, j) = 0    otherwise.
Thus CFL reports whether assignment of actor i to block c conflicts under the model with the assignment of actor j to block d, in the sense that a zeroblock would be violated for some tie q, 1 ≤ q ≤ k. For given i, it follows from this definition of CFL that the incidence of conflicts C(i) which i "initiates" (i.e., the number of nonzero entries in the 3-array defined by i in CFL) is less than or equal to

    D(i) = Σ_q A_iq B̄_q ,
where A_iq = Σ_j A(i, j, q) = (i, q)th marginal of A, and B̄_q = Σ_{c,d} B̄(c, d, q) = qth marginal of the complement array B̄ obtained from B by setting B̄(c, d, q) = ¬B(c, d, q) in Boolean arithmetic.
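Both the conflict array of Definition 4 and the bound D(i) just displayed are straightforward to tabulate; the following Python sketch (function and variable names are our own) builds CFL directly from the definition, counts the conflicts C(i) each actor initiates, and evaluates the bound.

import numpy as np

def conflict_array(A, B):
    # CFL(c, d, i, j) = 1 iff for some q, A(i, j, q) = 1 while B(c, d, q) = 0.
    m, n = B.shape[0], A.shape[0]
    CFL = np.zeros((m, m, n, n), dtype=int)
    for c in range(m):
        for d in range(m):
            zero_ties = (B[c, d] == 0)              # relations q for which (c, d) is a zeroblock
            CFL[c, d] = A[:, :, zero_ties].any(axis=2).astype(int)
    return CFL

def conflict_counts_and_bound(A, B):
    # C(i) = conflicts initiated by i; D(i) = sum over q of A_iq * Bbar_q.
    CFL = conflict_array(A, B)
    C = CFL.sum(axis=(0, 1, 3))                     # nonzero entries of the 3-array defined by i
    A_iq = A.sum(axis=1)                            # choices made by i on relation q
    Bbar_q = (B == 0).sum(axis=(0, 1))              # number of zeroblocks in image q
    D = A_iq @ Bbar_q
    return C, D                                     # C <= D holds entrywise by construction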
B̄_q is the number of zeroblocks in the qth image and A_iq, if the data are sociometric, is the number of choices made by individual i on relation q. If A(i, j, q) ∧ A(i, j, q′) = 0 for all i and j and q ≠ q′, then C(i) = D(i) for all i. From this result follows the intuitive conclusion that the number of conflicts engendered by i in nonoverlapping networks is a weighted linear combination of i's activity levels in each network separately, where the weights B̄_q are measures of the "tightness" of the blockmodel hypothesis for network q. More generally, the bound D(i) may be used as a baseline measure of crystallizer status; more refined measures may be defined combinatorially by considering additional structure implied by the interaction of CFL and tentative block assignments.

Despite these innovative features, BLOCKER itself has proved to have many limitations as an effective means for blockmodel exploration. Especially during preliminary acquaintance with a particular data set, the ingenuity required of an investigator specifying a plausible BLOCKER hypothesis can be exorbitant. In terms of the previous list of objections to early work on fitting particular target configurations (discussed above), BLOCKER answers the second objection (to ad hoc algorithms) by providing a uniform way of exploring arbitrary targets (blockmodel hypotheses), but fails to answer the first objection (viz., to fitting predetermined hypotheses), since no procedure is furnished to allow inductive specification of the proper hypothesis from the data.

BLOCKER's combinatorial basis is also a source of two crucial practical drawbacks. First, the program requires binary input data, thus excluding much useful information which may be contained in ordered choices or in continuous measures of network structure (e.g., Keller, 1951). Second, BLOCKER searches only for blockmodels which exactly satisfy the given hypothesis, with no tolerance for the least zeroblock impurity (i.e., α = 0 in Definition 2 above). As has now become clear from a wide variety of blockmodel applications, such rigorous demands are not uniformly achievable, and strict adherence to the zeroblock criterion may lead the investigator to discard a large number of otherwise highly informative α-blockmodels.

Just as the essentially discrete problem of seriation (discussed above) is best approached using the continuous approach of the MDSCAL algorithm, it would similarly seem worthwhile to seek an approach to obtaining blockmodels which is based on a continuous model. Such intuition finds additional reinforcement in the impressive success of J. Douglas Carroll at Bell Laboratories in approaching various inherently discrete data analysis problems by what he refers to as "mathematical programming" techniques (Carroll & Pruzansky, 1975; Carroll, 1976). This method involves use of a variety of continuous techniques to approximate (as limiting cases) combinatorial solutions to
discrete structure problems, and often proves highly effective when purely combinatorial methods are impractical or unduly arduous to implement.

D. CONCOR and Blockmodels from Hierarchical Clustering
The most widely used continuous method in existing blockmodel research has been the CONCOR algorithm. The formal properties of this algorithm have been presented in Breiger, Boorman, and Arabie (1975) (henceforth cited as BBA, 1975), and only a brief summation will be given here. CONCOR is a hierarchical clustering algorithm which exploits the empirically observed fact that when the columns (respectively, rows) of a matrix M_i = [m_jk(i)] are iteratively replaced by the column correlations, obtaining
    m_jk(i + 1) = correlation between the jth and kth columns of M_i ,

where m̄_k(i) = (1/n) Σ_{l=1}^{n} m_lk(i) denotes the mean of the kth column of M_i, then the limit lim_{i→∞} M_i exists except in certain anomalous cases (e.g., M_0 a cyclic matrix or a matrix having at least one constant column), and may be permuted into the bipartite blocked form

    M_∞ =  [ +1 | −1 ]
           [----+----]
           [ −1 | +1 ] ,
thus yielding a two-block clustering from the algorithm. To date, all known classes of exceptions to these convergence results are of measure 0 in the Lebesgue measure space of all real-valued square matrices of dimension n. Except as otherwise noted for rectangular matrices where rows and columns are blocked separately, all later analyses will be assumed to converge on columns, rather than on rows (thus continuing the emphasis of BBA, 1975, on received sociometric choices). Once the initial bipartition is obtained, the same procedure may be applied to split one or both of the two obtained blocks in turn, so that finer partitions may be obtained corresponding to the desired number of blocks.

The following empirical fact makes CONCOR a procedure which has been highly effective in the search for blockmodels: When a sufficiently sparse data matrix M_0 is permuted to conform to CONCOR-derived blocks, the permutation will generally reveal zeroblocks or near-zeroblocks in the data. Furthermore, extension of the method to handle multiple tie data is straightforward: it is only necessary to "stack" the k given data matrices, each of dimension n, to obtain a single kn × n matrix, and then to converge columns of the derived matrix
using CONCOR. The resulting partition will reflect contributions from each of the input matrices (see below for a discussion of the implied weightings associated with the stacking procedure), and may be imposed on each of these matrices in turn to obtain a consistent blockmodel across all types of tie (see also Appendix and BBA, 1975, for more extensive discussion of stacking).

To recapitulate, CONCOR thus yields two distinct kinds of information about the data: (1) a dendrogram or hierarchical clustering representation of structure, based on correlations; and (2) a graph image (as illustrated earlier for P and N) which arises from collapsing the usually much larger network depicted by M_0 (i.e., the raw data, which are generally not correlations) under either the strict zeroblock criterion (Definition 1) or an α-cutoff criterion (Definition 2). As a caveat to sociologists and social psychologists employing these methods, it should be noted that "hierarchical clustering" in the sense of (1) refers to the nesting of successive levels of blockmodel refinement, not to any actual or presumed identification of social hierarchy or dominance structure in the given data (e.g., in the sense of Landau, 1951, 1965). Regarding (2), observe that each level of the hierarchical clustering, i.e., of blockmodel refinement, will produce its own image matrices [such as those illustrated earlier for (P, N)]. As the degree of refinement increases, the successive image matrices more and more closely approximate the original network, until in the limit, where n blocks are used, each actor occupies a singleton block and the "images" are simply the original data. The opposite extreme is a single block comprising the entire population; in this case the homomorphism φ mapping data onto the image matrices is trivial. Thus, the problem of blockmodeling is one of choosing which level or levels of refinement generated through CONCOR provide the optimal degree of aggregation for understanding the data. To restate the problem mathematically, there is a natural lattice of blockmodels, based on the partition lattice of the individual actors (Birkhoff, 1967), and choice of a blockmodel involves a decision as to the desired level of elevation in this lattice ordering (see also below).

Turning to the history of the way in which CONCOR was introduced to this problem, several comments are in order. First, the notion of defining similarity between two items or individuals through the product-moment correlations between their column vectors is a traditional one [in psychology, this technique of gauging similarity is sometimes known as "profile similarity" (Shepard, 1972)]. Katz (1947, p. 240) seems to have been one of the first investigators to advocate the usefulness of the correlation measure in relation to sociometric data. In the context of blockmodels, the idea of taking correlations between individuals is substantively motivated as a way of approaching the more abstract concept of structural equivalence, originally borrowed from pure mathematics (see Lorrain & White, 1971; BBA, 1975). Specifically, the search for structurally equivalent actors motivates one to look at the consistency of actors' positions (as measured by the correlations), rather than at the connectivity of these positions, which had been the focus of much earlier effort in the clique detection area (Luce, 1950; literature reviewed by Hubert, 1974c, and by Arabie, 1977).
Such an emphasis on positional congruence via correlation was the concept which led White’s colleagues Ronald Breiger and Joseph Schwartz to arrive at CONCOR.
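For readers who wish to experiment, a minimal CONCOR-style iteration is easy to write down (Python sketch; the stopping rule, tie-breaking, and handling of anomalous cases such as constant columns are our own simplifications rather than part of the published algorithm):

import numpy as np

def concor_split(M, max_iter=100, tol=1e-12):
    # One CONCOR bipartition of the columns of M: iterate column-by-column
    # correlations until (near) convergence to a +1/-1 pattern, then split by
    # the sign pattern of the first row of the limit matrix.
    C = np.corrcoef(M, rowvar=False)
    for _ in range(max_iter):
        C_next = np.corrcoef(C, rowvar=False)
        if np.max(np.abs(C_next - C)) < tol:
            C = C_next
            break
        C = C_next
    return (C[0] > 0).astype(int)          # 1 = same block as the first column

def concor(M, depth):
    # Repeated bipartitions, giving up to 2**depth blocks of the columns of M.
    labels = np.zeros(M.shape[1], dtype=int)
    for _ in range(depth):
        new_labels = labels.copy()
        for b in np.unique(labels):
            idx = np.where(labels == b)[0]
            if len(idx) < 2:
                continue                   # singleton blocks cannot be split further
            new_labels[idx] = 2 * b + concor_split(M[:, idx])
        labels = new_labels
    return np.unique(labels, return_inverse=True)[1]   # relabel blocks 0..m-1

# Multiple networks ("stacking"): for k sociomatrices A1, ..., Ak, each n x n,
# apply concor to np.vstack([A1, ..., Ak]) so that the column correlations
# reflect received choices on all types of tie simultaneously.

Imposing the labels returned by such a routine on each original sociomatrix, and then applying a density-and-image computation of the kind sketched after Definition 2, yields the corresponding blockmodel.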
The performance of the CONCOR algorithm relies, of course, on the convergence of the iterated correlations, as discussed above and in BBA (1975). Since publication of that earlier paper, new, though still partial, analytic results have been obtained which tend strongly to support the fact of convergence as a general phenomenon aside from measure-0 exceptional cases. These results are presented at length in Schwartz (1976) and will be only briefly recapitulated here. First, Schwartz is able to establish the convergence of the analogous iterative process where, instead of replacing entries of M_i by (column) correlations to obtain M_{i+1}, as is done in CONCOR, the entries are now replaced by the column centered covariations (i.e., by the centered covariances, up to a multiplicative constant; see Schwartz, 1976, p. 256, n. 3). Schwartz establishes:
THEOREM. The iterative process (CONCOV) generated by covariation replacement converges in the sense that

    lim_{j→∞} k^(−2^j) C[j] = Q,

where

    C[0] = covariation matrix of centered columns of raw data,
    C[j] = jth iterate of the covariation iterative process, j = 1, 2, 3,...,
    I = identity matrix of dimension n,
    e = unit column vector of length n,
    Ω = (I − (1/n) eeᵗ) C[0],
    k > 0 = dominant eigenvalue of the matrix Ω.

The limit matrix Q has rank m, where m is the multiplicity of k as the dominant eigenvalue of Ω.
Proof. See Schwartz (1976, pp. 263-266).

Note the necessity of establishing convergence in a normed sense (i.e., through multiplication by the scalar normalizing factor k^(−2^j)), which is made necessary by the unnormalized character of the iterative process using covariations rather than correlations. The proof of the Schwartz result also casts light on the information contained in the limit matrix Q, which can be shown to depend only on the left dominant eigenvector(s) of Ω. Schwartz emphasizes that clustering on such a basis discards most of the information contained in the full spectral decomposition of Ω, and recommends against the use of CONCOV for this reason (1976, p. 267) and also because it requires an iterative process. Note, however, that these arguments depend on the presumption that all the spectral information in Ω is already at hand, which is counterfactual in almost all applications, where calculation of even the dominant eigenvector of Ω is a laborious process, often performed by iterative procedures (e.g., Shafto, 1972, p. 6). We therefore prefer to view the utility of CONCOV as a still open problem.

Turning now to CONCOR, Schwartz (1976, pp. 272-278) is unable to establish a convergence theorem analogous to that which he obtains for CONCOV. However,
he uses a combination of analytic and simulation methods to support a conjecture that the first (bipartite) split produced by CONCOR may closely resemble the sign pattern (entries > 0 versus entries < 0) of the dominant eigenvector of a matrix P[1], which in our present notation is

    M_1 (I − (1/n) eeᵗ) M_1 ,

indicating that it may be possible to mimic the behavior of CONCOR by analyzing the eigenstructure of an appropriate matrix. Our earlier point regarding CONCOV, that a recharacterization of the algorithm as an eigenvector problem does not necessarily yield computational advantages, continues to have force in the parallel CONCOR situation. However, the Schwartz analysis opens the way for exploring direct relations between CONCOR and classical techniques in the tradition of principal components analysis. Thus a conceptual cycle is complete, with contemporary investigators of CONCOR contemplating a return to procedures similar to those employed by Lambert and Williams in their foreshadowing of blockmodels in the early 1960s.

In a second paper, which anticipates some of Schwartz's analyses, Shafto (1972) reports similar results. He shows various ways of modifying CONCOR to obtain a class of related algorithms which may be expressed both as eigenvalue problems and also as solutions to appropriately posed maximum problems, e.g., maximize vᵗM_1v under the constraints

    vᵗv = 1,    vᵗe = 0,

where v is a (real) column vector having n components and M_1 is the first correlation matrix of the data [this is Shafto's (1972) Method A, pp. 6-7]. This line of development seems a promising one, for it opens the possibility of interpreting the blockmodel construction problem in terms of a class of optimization problems for social structures. Meanwhile, it is not presently recommended that CONCOR, as an empirically proven technique, be rejected for clustering purposes merely because its mathematical structure is still incompletely understood. Note, however, that use of the full information contained in the dominant eigenvector of M_1(I − (1/n) eeᵗ) M_1 will normally (i.e., in the absence of ties among eigenvector components) suffice to establish a complete linear ordering of individuals that is compatible with at least the first CONCOR bipartition. This point is made by Schwartz (1976, pp. 277-278) (compare Shafto, 1972, pp. 2, 5), and suggests a unification of the aims of seriation (see above) with those of blockmodeling (see also Gruvaeus and Wainer, 1972; Hubert, 1974d).
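The Schwartz-Shafto observations suggest a simple empirical cross-check, sketched below (Python; this is our illustration of the conjecture, not an analysis from the cited papers): compute the sign pattern of the dominant eigenvector of M_1(I − (1/n) eeᵗ) M_1 and compare it, up to relabeling of the two blocks, with a CONCOR first split of the same data.

import numpy as np

def eigenvector_split(M1):
    # Sign pattern of the dominant eigenvector of M1 (I - (1/n) e e^t) M1,
    # where M1 is the first correlation matrix of the data.
    n = M1.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n        # I - (1/n) e e^t
    P1 = M1 @ H @ M1                           # symmetric, so eigh applies
    _, vecs = np.linalg.eigh(P1)               # eigenvalues in ascending order
    return (vecs[:, -1] > 0).astype(int)

# Usage sketch: with M1 = np.corrcoef(stacked_data, rowvar=False) as the first
# correlation matrix, compare eigenvector_split(M1) against the first CONCOR
# bipartition of stacked_data (agreement is expected up to an exchange of the
# two block labels; "stacked_data" is a hypothetical name for the kn x n stack).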
There is also an empiricist tradition in the CONCOR area, centering around a number of independent rediscoveries of the basic convergence effect on which the procedure is based. The earliest such discovery of which we are aware was by Myron Wish in 1963 (personal communication). Although we do not know if Wish's is in fact the earliest discovery, it appears improbable that the convergence of iterated correlations was observed before digital computers became available. Probably the first author to publish the result was McQuitty (1968; see also McQuitty and Clark, 1968), who also recognized that repeated applications of the bipartite division procedure would generate a binary tree. Unfortunately, McQuitty's applications of the method were to data having very limited intrinsic interest, and with the exception of Shafto's follow-up study (1972) the algorithm does not appear to have attracted much notice at that time.

In the context of information retrieval problems, Salton (1968) suggested the iterative process of raising the product of a similarity matrix (not based on correlations) and its transpose to the Nth power to obtain "Nth-order associates" (a formally related construction was later independently developed by Breiger, 1974). Dichotomizing the resulting Nth-order matrix with a user-supplied cutoff value (Salton, 1968, p. 117) was suggested as the next step. However, Salton did not raise the possibility that some type of iterative convergence procedure might furnish the means for dichotomizing such a matrix, and his idea thus falls markedly short of CONCOR in this respect. Salton also considered the use of a similarity matrix of product-moment correlations, and furnished a discussion (1968, pp. 125-127) that suggests a correlation-based clustering procedure, although the details are not explicit. In a programmatic statement reminiscent of the Lambert and Williams procedure and also the Shafto-Schwartz analyses, Salton additionally considered an eigenvalue analysis (1968, pp. 135-139). Using the theory of reducible matrices, the analysis was intended to obtain clusters of items, "... where the set of items can be partitioned into nonoverlapping subsets in such a way that within each subset a nonzero connection exists between each pair of elements, but no connection exists between a member of a given subset and an element that is not a member" (compare block diagonal form above). Although Salton's work is thus suggestive both of the convergence of correlations and of permuted binary matrices, the actual product of anything resembling a blockmodel fails to appear in his 1968 book. Therefore, it appears that the closest anticipation of the approach of White and his colleagues is still that of Lambert and Williams (1962).
E. Illustrative Application
To illustrate the use of CONCOR on sociometric data, we briefly introduce data from Newcomb's (1961) well-known experimental studies of a college fraternity. In each of two years, undergraduate male subjects were recruited in exchange for reduced room and board, and were organized into an experimental fraternity. At prescribed intervals during a semester, each subject was required to rank all other members of the population on the basis of "interpersonal attraction." The raw data for each 17-man population therefore consist of a sequence of ordered choice matrices whose entries (i, j) indicate the location of j in i's sentiment ordering. To convert these matrices into binary matrices, White and co-workers proceeded to code each data matrix as two binary matrices: a first matrix coding as "1" only each individual's top two choices and a second matrix coding as "1" each man's bottom three choices. (A variety of alternative binary
41
BLOCKMODELS
codings were tried, and the pattern appeared robust.) For each week's data, the two matrices thus obtained were stacked (see above) and CONCOR was applied to obtain a blockmodel partition. For present purposes, we will consider only selected weeks from the Year 2 sequence, reporting blockmodels at a three-block level of fineness.

TABLE 3

Newcomb's Second Fraternity, Showing Density and Image Matrix Patterns for Selected Weeks^a

(For each of six selected weeks, including Weeks 3, 5, 8, 13, and 15, the table reports the 3 × 3 density matrices for Like and Antagonism together with the corresponding 3 × 3 image matrices for each relation.)

^a Cutoff used to obtain image matrices shown is α = 0.156, corresponding to the grand mean (see text).
Table 3 shows matrices selected to be at roughly equal reporting intervals from the time of formation in Week 0. All blockmodels shown are based on the CONCOR blocking obtained for Week 15, the final week of the study. Thus for each week the partition is (13, 9, 17, 1, 8, 6, 4) (7, 11, 12, 2) (14, 3, 10, 16, 5, 15), where the numbering of men follows that adopted by Nordlie (1958) in the original unpublished report. The left two columns of the table report the density matrices obtained from the data by dividing the number of “1” entries by the size of the corresponding block (see Appendix). “Like” describes the top two choices and “Antagonism” the bottom three. The right two columns depict the corresponding blockmodel images with blocks of density < 0.156 being coded as “0” and all other blocks as “1.” The chosen cutoff density α = 0.156
is the grand mean of the sociometric choice sample, i.e., 85/544 [where 85 = 17 x 5 is the total number of choices allowed to each man (= 5) multiplied by the number of men, and 544 = 2 x 17 x 16 is the number of allowed entries in the choice arrays for positive and negative choices (no man was allowed to choose himself under the data collection procedure, so the principal diagonal must be excluded in each case)]. Alternatively, a frequency histogram of the densities over all weeks considered shows a clear discontinuity with 39 cells having density ≥ 0.20, 60 having density ≤ 0.12, and only 9 having density in (0.12, 0.20). This distribution pattern supports choice of any α between 0.12 and 0.20; α = 0.156 roughly bisects the preferred interval. Observing the sequence, it is apparent that the social structure has already crystallized by Week 5: Density changes thereafter are minor, and no changes at all occur in the blockmodel images! By contrast, up to Week 5, there is considerable density fluctuation and the image matrices are also quite unstable. The equilibrium pattern of Weeks 5-15 is discussed at length in White et al. (1976, pp. 763-765), to which the reader is referred for substantive amplification. The robustness of the blockmodel pattern obtained from this experimental population is quite striking, and raises hopes that future investigations of nonexperimental social structures will similarly reveal unexpected invariances of pattern.

F. Summary

In seeking the row-column permutation from which a blockmodel is ultimately derived, the vehicle may be either combinatorial (e.g., BLOCKER) or continuous in nature, with the latter category subsuming both clustering methods like CONCOR and more traditional multivariate methods (e.g., Lambert & Williams, 1962). We have argued above that continuous methods, including CONCOR, are the more widely applicable and useful, with combinatorial algorithms retaining a needed subsidiary place because of the operational versions they provide of a number of inherently discrete structural concepts (e.g., floaters and crystallizers). Below, we turn to some specific technical considerations which the use of CONCOR raises.
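Since the coding of densities and images recurs throughout what follows, a small illustration in code may be useful. The sketch below is ours and not the program used in the studies cited: the function names, the toy data, and the choice of Python/numpy are illustrative assumptions; only the logic (block densities computed over off-diagonal cells, then thresholded at a cutoff α) follows the description above.

```python
import numpy as np

def block_densities(data, partition, exclude_diagonal=True):
    """Mean of the entries in each block (a, b) of a square 0/1 matrix,
    where blocks are induced by a partition of the actors."""
    k = len(partition)
    dens = np.zeros((k, k))
    for a, rows in enumerate(partition):
        for b, cols in enumerate(partition):
            vals = []
            for i in rows:
                for j in cols:
                    if exclude_diagonal and i == j:
                        continue  # no self-choices: skip the principal diagonal
                    vals.append(data[i, j])
            dens[a, b] = np.mean(vals)
    return dens

def image_matrix(dens, alpha):
    """Code each block as 1 if its density is at least the cutoff alpha, else 0."""
    return (dens >= alpha).astype(int)

# Toy 6-actor example (not the Newcomb data): two blocks of three actors each.
rng = np.random.default_rng(0)
data = (rng.random((6, 6)) < 0.3).astype(int)
np.fill_diagonal(data, 0)
partition = [[0, 1, 2], [3, 4, 5]]
dens = block_densities(data, partition)
print(dens)
print(image_matrix(dens, alpha=0.156))
```

On real data the partition itself would of course be supplied by CONCOR or a comparable clustering of the actors.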
3. PRACTICAL CONSIDERATIONS

A. Form of the Input Data

As noted in the BBA (1975) discussion, the input matrix M_0 for CONCOR is required only to be an m x n matrix of real numbers consistently coding relations within a population of actors. It is not even necessary to delineate a priori whether the relations so coded describe similarity/dissimilarity, liking/disliking, or any of the other traditional bipolar variables; in fact, we will argue below that the best results are obtained by the juxtaposition of many contrasting types of relations in the same stack, when the input data are entered in this way. In spite of this generality, most applications to date have entered only integer-valued matrices which are either binary or ordered choices. Moreover, it has become empirically clear that data in binary form (dichotomized if necessary, as
in the Newcomb example above) usually provide the best candidate for substantive interpretation through the blockmodel formalism. A similar advantage for binary data has been reported by plant ecologists using principal components analysis (Hill, 1974, p. 350; Austin & Greig-Smith, 1968). We would like to suggest a possible basis for this fact, and then propose various ways of preprocessing nonbinary data in order to obtain binary M_0. It is to be stressed that this part of the area remains in a largely premathematical state, though by the same token constituting the field where the next major mathematical advance is perhaps most likely to occur [compare the promising distribution results of Holland and Leinhardt (1970, 1974, 1977)]. The principal empirical advantage of binary data once a blockmodel has been obtained from CONCOR is that such data seem to enhance the pattern of contrasting densities between potential zero- and oneblocks (e.g., Table 2, as well as densities shown in the left two columns of Table 3). That is, although the presence of continuous data in M_0 poses no difficulty for computing correlations, cases blockmodeled to date indicate that such data are usually less likely to yield a bimodal distribution of block densities (see also BBA, 1975, pp. 353, 355).5

5 A potentially useful significance test for bimodality has been devised by J. B. Kruskal (see Giacomelli, Wiener, Kruskal, Pomeranz, & Loud, 1971).

Because, as we have seen, such an obtained distribution facilitates a clear distinction between zeroblocks and oneblocks, the interpretation of blockmodels is much easier in these cases, and efforts to produce a binary M_0 seem justified. Although this argument is an empirical conjecture, there are certain examples of nonbinary data for which theoretical reasons can also be adduced. Thus consider an M_0 having both positive and negative entries (e.g., where the actors report both positive and negative affect toward each other, and both types of ties, presumed to be mutually exclusive, are summarized in the same M_0). Once the permutation of M_0 has been determined through CONCOR, if the densities within blocks are naively computed, then positive and negative entries may tend to cancel out, giving a spurious impression of zeroblocks. Accordingly, it seems desirable to fractionate such a matrix into distinct matrices of positive and negative affect, stack the matrices, and execute CONCOR on this stack (as was done in the analyses of the Sampson monastery data [1969] reported by White et al., 1976). Following determination of a blockmodel partition by the algorithm, a common permutation may then be applied separately to each of the stacked matrices in order to compute the block densities and interpret the pattern. Another class of frequently encountered nonbinary data is ranked choice data (e.g., forced-choice sociometric data where choices are recorded in order, top man first, etc.), or similarly, category scale data. The objection to naive application of CONCOR to such data matrices is a classical one, namely that the “equal-intervals” assumption about category scale or rank-order data is questionable (Luce & Galanter, 1963, pp. 261-264; Stevens, 1971). In blockmodel applications involving such data, the data have usually been initially dichotomized on the basis of choice level as in the Newcomb example (e.g., only the top two choices are taken, and each is coded as “1”). When possible, category scale data are fractionated into several distinct ties and entered as a stacked
binary matrix (as was also done in the Newcomb case). In many cases, these preliminary coding procedures should also improve the robustness of the data, in view of the probable unreliability of intermediate choice levels across replications and the tendency of such intermediate-level ties to be unstable in participant-reported social structures. An example of fractionation is given by Breiger’s (1976, p. 119) preprocessing of the network data on researchers studying hypothalamic regulatory mechanisms. These data were reported by Griffith et al. (1973) in an unpublished study from the Drexel University Library Science Department, and consist of questionnaire responses from a random sample of 107 individuals who were asked to rate the extent of interaction with fellow researchers on a seven-point category scale of professional contact. The extreme categories of the scale had verbal labels of “Present association (is) continuing personal contact” and “Unaware of man and his work.” (Griffith et al. do not indicate whether all researchers were in fact males.) It would have been straightforward, of course, merely to consider the seven-point category data as a proximities matrix and to enter those data as a square M_0 for CONCOR. Such a procedure (perhaps with some means of symmetrizing the input matrix) is usually what is done, for example, when multidimensional scaling and related procedures are being employed (see Levine, 1972). However, Breiger fractionated the data into several distinct matrices, such as “mutual contact,” “asymmetric awareness,” and “symmetric unawareness.” By selecting which of these fractionated matrices to enter as CONCOR input, he was able to obtain highly interpretable and quite striking results (Breiger, 1976; see also Arabie, 1977, and results reported in White et al., 1976). Although it is not obvious that the procedures used by Breiger are canonical (for example, he suppresses “asymmetric contact”), there is a sound and very useful strategy at issue here. Specifically, as stressed at the beginning of this section, the most interesting interpretations are likely to follow when a blockmodel is formed on multiple types of network displaying maximal relational contrast, e.g., symmetric ties versus asymmetric ties, or positive sentiment versus antagonism (see also Mullins et al., 1977). Unfortunately, in all too many sociometric data sets, even the most ingenious fractionation fails to generate sufficient contrast to provide interpretive interest, typically because fractionation produces unacceptably low densities in the raw data (for example, because of zero columns, causing the correlations which CONCOR requires as input to be undefined, as well as other interpretive difficulties). By contrast, the best existing data sets, such as the one represented in Sampson’s massive empirical study (1969), furnish multiple axes for contrast arising from competing and often contradictory networks. The experienced data analyst will have noticed that stacking such contrasting matrices to create M_0 will assign implied weights to each component type of tie if the component matrices are differentially sparse (see also discussion in Schwartz, 1976). To date, there has been no systematic investigation of weighting effects arising through CONCOR, and each matrix in the column stack has been entered without correction (such as centering on column means or some alternative weighting-normalization method).
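To make the fractionation-and-stacking step concrete, here is a minimal sketch, again in Python and again ours rather than any procedure actually used in the studies cited. The splitting of a signed matrix into positive and negative stories follows the rationale given above; the optional per-story column centering is included only to mark where a weighting-normalization correction of the kind just mentioned could be applied, and is not part of any published application.

```python
import numpy as np

def fractionate_signed(m):
    """Split a matrix holding both positive and negative affect into two
    binary matrices, so that opposite signs cannot cancel within a block."""
    pos = (m > 0).astype(float)
    neg = (m < 0).astype(float)
    return pos, neg

def stack(matrices, normalize=False):
    """Row-stack K n-by-n matrices (same actor ordering in every story) into a
    Kn-by-n array for column clustering.  Per-story column centering is one
    conceivable correction for differential sparsity; it was not applied in
    the applications discussed in the text."""
    stories = []
    for m in matrices:
        m = np.asarray(m, dtype=float)
        if normalize:
            m = m - m.mean(axis=0, keepdims=True)
        stories.append(m)
    return np.vstack(stories)

# Toy signed sociomatrix for 4 actors (illustrative only).
m = np.array([[ 0,  1, -1,  0],
              [ 1,  0,  0, -1],
              [-1,  0,  0,  1],
              [ 0, -1,  1,  0]])
pos, neg = fractionate_signed(m)
m0 = stack([pos, neg])          # 8 x 4 input for column clustering
print(m0.shape)
```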
It seems clear, however, that weighting effects in blockmodels are a substantively important topic which should be investigated in the future. For example, it is by no means clear that failure to have heard of a coinvestigator (e.g., Breiger’s symmetric and asymmetric
unawareness) should be assigned the same weight as maintenance of a professional contact, which requires effort and time investment. Theoretical ideas on appropriate weighting procedures may be derived from the job information flow model of Boorman (1975), where the parameter A > 1 serves as a natural weighting factor giving strong ties predominance over weak ones when both are entered in the same stack (see also Delany, 1976).

B. Diagonals

As a consideration in constructing blockmodels, we are concerned with diagonals at two conceptually and physically distinct levels: (1) in M_0, and (2) in the reduced image matrix which is the essence of a blockmodel hypothesis. In both instances we are talking only about a data base where the rows and columns index the same items, since an m x n matrix (m ≠ n) in effect has no (principal) diagonal. Turning first to diagonals in the raw data matrix M_0, it is often the case that the diagonals are either undefined or in some other way unavailable. Diagonals are not furnished at all, for example, in the several hundred sociomatrices contained in the Davis-Holland-Leinhardt data bank (Davis, 1970; Davis & Leinhardt, 1972). Schwartz (1976) has presented arguments as to why diagonals should not be included in the computation of correlations in CONCOR, even when diagonal entries are defined in the data. In the applications presented in BBA (1975), the diagonals were included for purposes of correlation (each sociomatrix being assigned a zero diagonal), but not for calculating image matrix densities. In retrospect, it would perhaps have been preferable to exclude the diagonals from both phases of the analysis. To elaborate on the mechanics of excluding the diagonal entries from correlations, consider first a single n x n matrix. In correlating columns i and j, the four entries (i, i), (j, i), (i, j), (j, j) are excluded (i ≠ j), and the value (n - 2) is used in the Pearson correlation formula rather than n. These same entries will also be omitted in the computations producing M_{i+1} from M_i for i ≥ 1. For a stacked matrix consisting of K types of ties and thus having dimensions Kn x n, each of the K diagonals in M_0 is ignored during computation of M_1. Thus, there are K(n - 2) entries for each column being correlated when M_1 is generated. For the subsequent sequence of matrices produced by iterated correlation, M_1 through M_∞, each having dimensions n x n, there is only a single diagonal to ignore in each iteration.
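The exclusion rules just described can be stated compactly as a masked correlation. The following sketch is a rough illustration of the iterated-correlation step under these exclusions, not the published CONCOR implementation; the function names, the convergence test, and the toy data are our own choices, and the same validity mask could in principle also carry missing entries, a possibility taken up later in this section. Constant columns are handled here by the zero-correlation artifice discussed immediately below, solely to keep the toy example running.

```python
import numpy as np

def masked_column_correlations(m, valid):
    """Pearson correlations between all pairs of columns of m, using only the
    rows flagged as valid for BOTH columns of a pair (so the four cells
    (i, i), (j, i), (i, j), (j, j) drop out per story, as in the text)."""
    n = m.shape[1]
    r = np.ones((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            keep = valid[:, i] & valid[:, j]
            x, y = m[keep, i], m[keep, j]
            if x.std() == 0 or y.std() == 0:
                # The published algorithm terminates when a constant column
                # arises (see below); r = 0 is the artifice mentioned there,
                # used here only so the toy example keeps running.
                r[i, j] = r[j, i] = 0.0
            else:
                r[i, j] = r[j, i] = np.corrcoef(x, y)[0, 1]
    return r

def concor_split(m0, valid, tol=1e-8, max_iter=200):
    """Iterate column correlations until the matrix stabilizes (entries tend
    toward +1/-1), then split the actors by the sign pattern of the first row."""
    m, v = np.asarray(m0, dtype=float), valid
    for _ in range(max_iter):
        m_next = masked_column_correlations(m, v)
        if m_next.shape == m.shape and np.allclose(m_next, m, atol=tol):
            m = m_next
            break
        m = m_next
        v = ~np.eye(m.shape[0], dtype=bool)   # later passes: a single diagonal to ignore
    return list(np.where(m[0] >= 0)[0]), list(np.where(m[0] < 0)[0])

# Toy stack: K = 2 ties ("like", "antagonism") on 5 actors, each story's diagonal excluded.
like = np.array([[0, 1, 1, 0, 0],
                 [1, 0, 1, 0, 0],
                 [1, 1, 0, 0, 0],
                 [0, 0, 0, 0, 1],
                 [0, 0, 0, 1, 0]], dtype=float)
antag = np.array([[0, 0, 0, 1, 1],
                  [0, 0, 0, 1, 1],
                  [0, 0, 0, 1, 1],
                  [1, 1, 1, 0, 0],
                  [1, 1, 1, 0, 0]], dtype=float)
m0 = np.vstack([like, antag])                      # 10 x 5 stacked input
valid = np.vstack([~np.eye(5, dtype=bool)] * 2)    # each story's diagonal is ignored
print(concor_split(m0, valid))                     # expect the split [0, 1, 2] versus [3, 4]
```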
It should be noted that when a constant column arises in a matrix, the CONCOR algorithm (unlike BLOCKER) terminates prematurely, faced with the zero denominator of any correlation coefficient which involves that column (BBA, 1975, p. 335). While it is possible to get around this formal difficulty by defining the correlation between any nonconstant column and a constant column to be 0, this artifice creates further problems with the (-1, +1) blockability of the limit matrix M_∞ as well as chaining problems in any clustering solution inferred. When diagonals are excluded from the columns for computing correlations, the vulnerability of CONCOR to constant columns is enhanced. Specifically, each binary column of a sparse matrix must have at least two off-diagonal unities if the procedure is to be executed without including the diagonals in the correlations. (A single off-diagonal entry in row i of column j will be excluded when columns j and i are correlated, thus leaving j as a constant zero column.) Thus, sociomatrices collected through the use of one-choice procedures (see cases cited by Holland & Leinhardt, 1973) are not amenable to CONCOR analysis, although this limitation is probably inconsequential because such data in any case have doubtful utility for other reasons (Holland, 1977). At the level of the blockmodel resulting from the image matrix, there is also a principal diagonal even though the principal diagonal of M_0 may itself be undefined. The investigator who is seeking cliques among the actors would expect to find oneblocks on the diagonal for a tie of positive affect (e.g., alliance, friendship, etc.). Thus, in this case, blockmodel analysis subsumes clique detection (discussed in Section 1) as a special case; even though CONCOR does not explicitly search for cliques or for a block diagonal pattern, cliques present in the data will normally be identified by the algorithm. For other types of tie (e.g., “aspires to resemble someday”), a oneblock on the diagonal of the image matrix seems intuitively implausible. For this reason, the density of diagonal blocks in such a matrix may help to suggest a useful cutoff density for discriminating zero- from oneblocks in the image.

C. Goodness-of-Fit

As in the testing of any hypothesis, some measure of goodness-of-fit is desirable for a proposed blockmodel. Although several alternatives are available, none has yet seen extensive usage in blockmodel applications, perhaps because of White’s often-voiced skepticism about statistical inference procedures. Our purpose in the present discussion is to mention some of the relevant literature and to outline questions which such methods may eventually answer. A particularly noticeable gap is the absence of any systematic method for specifying the most informative level of blockmodel refinement, even though it is intuitively clear that both extremes of fineness and coarseness are equally uninformative and an intermediate optimum should therefore be definable. This is a special case of the general problem of selecting clusters from “appropriate” levels of the dendrogram obtained by certain hierarchical clustering methods (see Hubert, 1973; Ling, 1975; Baker & Hubert, 1976). One possible line of attack on the general goodness-of-fit problem for blockmodels may proceed by adapting results of Hubert, e.g., as reported in Hubert and Baker (1978). Although this approach will be noted to have significant drawbacks for the present problem, it is worth outlining, both because of its interesting null hypothesis and because the technique is a further illustration of the utility of permutation methods. Specifically, consider a single n x n data matrix M, assumed to be binary for present purposes, and assume a given blockmodel image matrix having b blocks. To apply the Hubert-Baker procedure, it is first necessary to convert the blockmodel matrix into a matrix having the same dimension (n) as the data, thus completely filling in oneblocks (except for the diagonal, taken to be zero) as well as zeroblocks. In the terminology of White (BBA, 1975, p. 332), the test will thus proceed under a “fat fit” rather than a “lean fit” hypothesis. Let B denote the expanded model matrix having dimension n.
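The expansion step can be written in a few lines; the sketch below is ours (the function name is invented), and the 2 x 2 image used in the last line, with oneblocks on the diagonal, reflects our reading of the P pattern for the 7-man example that follows.

```python
import numpy as np

def expand_image(image, block_sizes):
    """Expand a b-by-b image matrix into an n-by-n "fat fit" model matrix B:
    every cell of a oneblock is filled in with 1, zeroblocks stay 0, and the
    principal diagonal is set to zero."""
    labels = np.repeat(np.arange(len(block_sizes)), block_sizes)  # block label of each actor
    b = np.asarray(image)[np.ix_(labels, labels)].astype(int)
    np.fill_diagonal(b, 0)
    return b

# A 2-block image with oneblocks on the diagonal, expanded to blocks of sizes 4 and 3.
print(expand_image(np.array([[1, 0], [0, 1]]), [4, 3]))
```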
To gauge agreement between M and B, assume that M and B are compatibly arranged, e.g., for the case of the 2-block model on the population of 7 individuals presented earlier (Sect. 2B) (taking the P relation only):
P = [the 7 x 7 binary data matrix for the P relation of Sect. 2B, arranged with the first block (4 men) preceding the second block (3 men); its individual entries are not recoverable from the source],

B =

0 1 1 1 0 0 0
1 0 1 1 0 0 0
1 1 0 1 0 0 0
1 1 1 0 0 0 0
0 0 0 0 0 1 1
0 0 0 0 1 0 1
0 0 0 0 1 1 0
Under the natural 1-1 correspondence between rows and columns of M (P in this example) and B, Hubert and Baker now define a number of measures of agreement between the data and the model, e.g., the correlation coefficient

r(M, B) = [n(n - 1) n_MB - n_M n_B] / (n_M n_B n_m n_b)^{1/2},

where
n_MB = number of 1’s common to M and B,
n_M = number of 1’s in M,
n_B = number of 1’s in B,
n_m = number of off-diagonal 0’s in M,
n_b = number of off-diagonal 0’s in B,
n = population size.
The problematic issue now becomes that of seeking to evaluate the obtained r (which in our example using P will be 0.368) in a significance-level sense. Here the Hubert-Baker approach is agreeably congruent with assumptions about social structure which underlie White’s approach (e.g., White et al., 1976). Specifically, the null hypothesis is that there is a random assignment of men to blocks, with the distribution of block sizes assumed to be known and fixed. Then given the social structure evidenced by M, together with the blockmodel determined by B, Hubert and Baker (1) show that E(r) = 0, where E(r) is the expected correlation between B and M, randomized over possible 1-1 assignments ω: {1, 2, ..., n} → {1, 2, ..., n} which establish bijections between rows and columns of M and those of B; (2) compute the variance V(r) of this correlation. The Z-value r/(V[r])^{1/2} may then be quantitatively studied to assess significance levels for fit, either on the assumption of asymptotic (large n) normality for the unknown Z-value distribution having mean 0 and variance 1 (an assumption about which Hubert and Baker are properly cautious), or by various standard inequalities of statistical theory, such as Cantelli’s inequality (asserting that for a one-tailed test the true significance level associated with a specific observed Z-value is bounded above by 1/(Z² + 1)). There are two convincing features of this procedure. First, and most importantly, it is sensitive to all social structure that is implied by the data matrix M. This fact is
most obvious if we regard the calculation of V(r) (as well as of E(r)) as being performed for the fixed ordering of B shown above and for all n! possible permutations of M. Because we are thus limiting our attention to relabelings of the actors which otherwise leave all features of the social structure invariant, the chosen null hypothesis does not introduce extraneous (and perhaps impossible or imaginary) social structures into the assignment of a significance level to the comparison between M and B. A particular aspect of this feature of the test is that all matrices M* derived from M through permutation, and over which the variance is computed, will possess similar role structures in the sense of semigroup algebra (see below). Second, goodness-of-fit is presently being assessed relative not only to the number of blocks in the model B, but also to a full distribution of block sizes. Thus the present test avoids pathologies arising, for example, when an (n - 1, 1) split is regarded as a “two-block” model. There are also a number of difficulties. The existing approach has not yet been generalized to three-way data, notwithstanding the fact that most of the important blockmodel applications proceed in terms of multiple networks (e.g., Sect. 1C above and Table 3). The lack of information about the distribution underlying the Z-values is an annoyance, since bounding inequalities like Cantelli’s are typically extremely weak. It is also unclear how Z-values will compare across different blockmodels of the same data, or the same blockmodels of different data; if, for example, the assumption of normality in the Z distribution is a poor one, or a poor one for fixed n as the number of blocks becomes small or becomes large, difficulties of comparability may become formidable. Finally, there should be some uneasiness in working with a procedure which is tantamount to the assumption of fully complete oneblocks. As we have already pointed out, there are many substantive reasons to suspect that “fat fit” (filled-in oneblocks) is usually a poor assumption, and a test of goodness-of-fit which proceeds on a fat fit hypothesis may be creating a straw man. This last problem has been directly attacked in a highly special case by White (1977), whose objective is to specify the chance probabilities for certain image matrices (e.g., the P or N patterns of Section 2), assuming two-choice sociometric data in M_0 and a pure zeroblock criterion (α = 0). The null hypothesis for a given blockmodel was that its pattern had been randomly generated, with no bias (for example) toward transitivity or interaction between data matrices for different types of tie. Of course, this is a quite implausible null hypothesis for almost all social structures, although it successfully avoids assumption of high density in the target matrix B. One further special aspect of the goodness-of-fit problem calls for special mention because of the frequency with which it presents itself in practical applications of CONCOR. Pursuing further differences between CONCOR- and BLOCKER-derived structure, we noted in Section 2 that with the latter program, the user must specify whether a 2-, 3-, etc., blockmodel is desired as part of the blockmodel hypothesis. BLOCKER then reports the permutation(s), if any, satisfying this hypothesis on the data. The situation is quite different for CONCOR, which operates as a divisive hierarchical clustering algorithm (see Lance & Williams, 1967a).
Specifically, CONCOR yields a binary tree, having a minimum of ([log_2 n] + 1) (here [ ] denotes the greatest
integer function) and a maximum of n separate levels, where n is the number of actors, and the root and termini are included in the count of levels (see also Boorman and Olivier, 1973). Such a tree will yield more than one distinct m-block blockmodel when m is not a power of 2. For example, if we have a 4-block model derived from two successive CONCOR splits, viz.:

[tree diagram: the root population is split into two clusters, each of which is split once more, yielding four blocks]

There are two different ways of collapsing the 4-block model to obtain a 3-block model:

[tree diagrams: the two alternative 3-block trees, each obtained by undoing one of the two second-level splits]
In the past, the choice between these two alternatives has been made largely on the basis of which of the blockmodels seems substantively more interpretable. However, a measure of goodness-of-fit telling, for example, which of the two competing models leads to a larger chi-square value, would be desirable. No such measure exists at this time, although it seems clear that empirical investigators tend to give preference to the 3-block solution which displays greater contrast between the low and high density blocks obtained. The lack of any rigorous basis for the selection acquires additional force when it is noted that CONCOR yields with its clusters no numerical values for further interpretation, thus contrasting with the ordinal alpha values in Johnson’s (1967) presentation of the single- and complete-link methods, or with the linear weights in the Shepard-Arabie (1975) ADCLUS algorithm for nonhierarchical clustering (Arabie, 1977).

D. Miscellaneous Considerations

The present discussion of blockmodels began with square binary matrices, where the rows and columns index the same set of actors. As noted elsewhere in this paper, for the alternative case of m x n matrices, whose rows and columns index disjoint sets (see the example from Lambert and Williams, 1962, in Table 1), CONCOR can be separately applied to the rows and to the columns. For such data, m usually does not equal n, so that we have used the descriptor “rectangular” for those matrices. The reader who is familiar with the theoretical history of social network analysis during the 1950s and 1960s may be disturbed by the manifest indifference of CONCOR toward the distinction between rectangular and “true network” data, the latter being data which are formally equivalent to a directed graph (Berge, 1962).
In the history referred to, investigators with a structuralist bias frequently tried to distinguish network analyses from conventional contingency table methods associated with attribute cross-classifications [e.g., see related discussions in White (1970) for the case of mobility data]. Formally, of course, contingency tables are a case of rectangular data in the sense we have specified, so that any attempt to preserve this distinction in the CONCOR formalism is lost as soon as the algorithm’s applicability to the rectangular case is recognized. However, on the basis of better empirical understanding now available (and in part made possible by the development of blockmodels), the uniform handling of the square and rectangular cases actually appears as an asset of the procedure. As has progressively become clearer, what the early theorists actually found objectionable was not the concept of attribute-by-attribute or item-by-attribute classification per se, but rather the almost invariable association of such classifications with predigested aggregations of both columns and rows into a few a priori categories (thus rows might be ethnicities and columns political party affiliations; compare general comments on the pitfalls of too-ready contingency-table aggregation in Batchelder and Narens, 1977). By contrast, whenever CONCOR is applied to rectangular data, the same principle already made clear for square matrix cases continues to apply without change: The data are allowed to determine their own most appropriate aggregation (i.e., the blocks that emerge), and congruence between the obtained blockings and various other categorical groupings is enforced by the data rather than imposed upon them. An example developed in BBA (1975) is based on Levine’s (1972) study of directorship interlocks in American industry, where the primary data are recorded in a rectangular matrix with the rows indexing corporations and the columns indexing banks (see also BBA, 1975, Figs. 4-6). Note that the idea of a stacked matrix where each of the K stories is an n x n matrix on the same set can be generalized to rectangular data. Specifically, one could start with K matrices each of dimension m x n, recording relations between a fixed row set of size m and column set of size n. These K matrices could then be stacked in two alternative ways to produce two separate stacked matrices of dimensions Km x n and m x Kn, respectively. Applying CONCOR to columns and to rows, respectively, one would obtain blockings of both index sets incorporating data from the full three-dimensional input array. Such a generalization of the stacking procedure has not yet been explored empirically, but there will normally be no difficulty with the convergence since a square matrix will be obtained in each case after just one iteration. One final comment concerning the analysis of rectangular matrices is in order. The facility for clustering columns and, separately, clustering rows, is of course not unique to CONCOR and is in fact shared with numerous other clustering procedures (see Hartigan, 1975, pp. 251-298, for a discussion). An important point not made in BBA (1975) is that to obtain a blockmodel through CONCOR (i.e., to get the arrangement of densities from M_0), both columns and rows must be partitioned for an M_0 of dimensions m x n. Schwartz (1976) has faulted BBA for stating (1975, p. 336) that in successive subdivisions of blocks (e.g., in passing from a 2- to a 4-block model with CONCOR), the user returns to the columns of M_0.
Schwartz notes that the first-correlation matrix (M_1) contains all the required information to obtain further subdivisions. His observation,
of course, is correct and highlights the fact that CONCOR is only one way of analyzing the proximity matrix M_1, with alternative courses being scaling (BBA, 1975, Figs. 6, 7) or nonhierarchical clustering (Shepard & Arabie, 1975). From a computational viewpoint on square data, however, note that it is more economical to return to M_0 rather than retaining M_1 in core, since M_0 must be retained in any case for the purpose of computing densities and blockmodel images once the clustering is complete, whereas M_1 is no longer needed after M_2 has been computed. An aspect of data analysis rarely mentioned in discussions of clustering algorithms is the problem of missing entries (but see Hartigan, 1975, p. 267), although it seems probable that capabilities for handling missing data could be developed for several of the more commonly used algorithms. The earlier procedure for excluding diagonal entries could be generalized to handle this problem in the case of CONCOR. The robustness considerations this generalization raises remain to be analyzed (for promising results along related lines, see Devlin et al., 1975). An alternative strategy for missing values in CONCOR applications would be to supply zeros for the missing data entries. The rationale is that the resulting analysis will tend to be a conservative one, in the sense that if interpretable blockmodel structure exists this structure will probably be revealed by the default of supplying zeros, whereas supplying ones may mask blockmodel structure that is objectively present by introducing impurities into zeroblocks (see also the more general discussion of sociometric masking in Holland & Leinhardt, 1973). In the description of BLOCKER, we noted that this algorithm has a unique facility in explicitly enumerating “floaters” in the social structure. CONCOR has no such provision, and some effort must be expended to ascertain the identities of the less stable actors in the blockmodel representation. One useful check on these identities, used in White et al. (1976), is to apply CONCOR first to suggest a blockmodel and then to enter this obtained blockmodel as input to BLOCKER, in order to determine whether the obtained solution is in fact unique. If the solution is not unique, BLOCKER identifies the floaters, so that one has a useful check on the presence of positional ambiguity relative to the hypothesis stated. In BBA (1975, p. 348), we compared the CONCOR dendrogram with those obtained on the Bank Wiring Room group (Homans, 1950) for the single- and complete-link methods. We observed that most of the discrepancies between clustering solutions at the 4-block level involved actors designated by Homans (1950) as being located ambiguously in the social structure [similar ambiguity of placement was also apparent from an MDSCAL scaling of M_1 (BBA, 1975, p. 372), which showed ambiguously placed individuals as being located in interstices between well-defined clusters comprising the Homans cliques]. Recently, we have observed an occasional instability in the mathematical behavior of CONCOR which may point to an alternative way of determining floaters with this algorithm. Specifically, if one begins with M_1 for the Bank Wiring Room group calculated to seven or more decimal places, the dendrogram given as Fig. 1 of BBA (1975) is obtained. However, if M_1 is computed to only five decimal places, then actor W6 is no longer clustered with S2 but is incorporated into the cluster (W7, W8, W9, S4).
The labeling of W6 as a floater is compatible with Homans’ verbal presentation (1950, pp. 71, 73, 132-133, 138-139), and this instability in placement
of W6 is therefore substantively reasonable. In view of the previously noted connection between CONCOR and principal components eigenvectors (Sect. 2D ff.), the instability is not surprising given the known sensitivity of many eigenvalue methods to truncation in machine implementations (e.g., Wilkinson, 1963; 1965, pp. 110-188). Notwithstanding this origin in computational accuracy, the instability furnishes a further possible avenue for investigating the mathematical properties of CONCOR and its potential for designating floaters.6

6 Mr. John Delany of Yale University has begun investigation of the stability of CONCOR solutions, using artificial data based on Table 1 in BBA (1975).

E. Prospects: Methodology

Because of the essential simplicity of CONCOR from a user’s standpoint, blockmodels are remarkably easy to obtain, considerably more so, for example, than are multidimensional scaling solutions for data sets of similar size. Moreover, neither CONCOR, nor BLOCKER, nor any other single algorithm, is indispensable to implementation of blockmodels. Many alternative hierarchical clustering methods are widely available, each of which may also be applied to M_1 to obtain blockings. A task for future study would be to investigate CONCOR’s position in taxonomies of such algorithms (e.g., Lance & Williams, 1967a), as well as CONCOR’s particular advantages and limitations. Because it is all too simple to obtain a proliferation of blockmodels, many problems of evaluation may be expected to arise as the area advances. Far too little is yet understood about the sense in which one blockmodel is valid and another is not. We have already touched briefly on one aspect of evaluation, the statistical assessment of blockmodel fit relative to some statistical null hypothesis (Sect. 3C above). One important further problem, also bearing on evaluation issues, is that of comparing different blockmodels, whether based on the same data or on entirely different populations. Since the ordering of blocks in a blockmodel is arbitrary, comparison questions are not immediately amenable to direct Hamming or other metric comparisons between matrices (Boorman, 1970; Mirkin & Chernyi, 1970; Boorman & Arabie, 1972). One approach to these problems has been proposed by Boorman and White (1976) and has its origins in formal ideas on the generation of compound roles in complex social structures. Any blockmodel may be used to generate a semigroup of Boolean matrices, which may be substantively interpreted as an accounting of all possible compound roles implied by primitive (generator) roles in the data (thus: my friends [P]; my friends’ friends [P²]; my enemy’s friends [NP]; and so forth). Formally, define:

DEFINITION 5. Let B be an arbitrary m x m x k binary array, e.g., as determined by a blockmodel. The Boolean semigroup SR(B) is the set of all Boolean relations which may be obtained as finite products of the k generator matrices constituting B, which may be separately designated G_1, G_2, ..., G_k.

Obviously, the classical prototype of this construction is transitive closure. If ordinary
arithmetic matrix multiplication is used (e.g., Luce, 1950), the semigroup obtained will be infinite in almost all cases. The finiteness of SR(B) obtained by Boolean multiplication follows from the fact that there are only 2^{m²} distinct binary relations on a set of size m. In practice, the computability of SR(B) depends on the semigroup having a size which is much smaller than 2^{m²}. In the empirical cases analyzed by Boorman and White (1976), the obtained SR(B) seldom had sizes larger than 30, even when m was 5 [5-block models of the Sampson (1969) data]; see also Boorman (1977) for analytic calculations in the highly special case of a regular tree. The comparison of blockmodels may now proceed by means of the purely algebraic analysis of relations between the associated Boolean semigroups. Given two arbitrary binary arrays, B_1 of dimensions n x n x k and B_2 of dimensions p x p x k, assume only enough outside knowledge of the kind of information coded in the generator matrices to establish a fixed 1-1 correspondence between generators across the two arrays. Without loss of generality, use the same symbols G_1, G_2, ..., G_k to denote the matched generators in either case (thus G_1 might report alliance, G_2 antagonism, G_3 information transfer, and so forth). Then both SR(B_1) and SR(B_2) may be interpreted as homomorphic reductions of the same free semigroup FS(G_1, G_2, ..., G_k) over the generator symbols G_i (Clifford & Preston, 1961; Crowell & Fox, 1963). Using one of the basic theorems of universal algebra (Cohn, 1965), the quotients of FS(G_1, G_2, ..., G_k) will be endowed with a natural lattice structure defined by homomorphic reductions (i.e., congruence relations on FS(G_1, G_2, ..., G_k)). Then the natural algebraic measure of similarity between B_1 and B_2 will be the algebra which is the joint homomorphic reduction of SR(B_1) and SR(B_2) in this lattice, i.e., the largest possible semigroup which is a common reduction of SR(B_1) and SR(B_2). If numerical similarity measures are desired, as will be the case for most applications, a variety of natural measures are suggested by the same construction, e.g., as obtained by adding the entropies of the partitions of SR(B_1) and SR(B_2) which are induced by the inverses of the homomorphisms onto the joint reduction:
[Diagram: SR(B_1) and SR(B_2) are each mapped homomorphically onto their common reduction JNT(B_1, B_2).]
where JNT(B_1, B_2) is the joint reduction (see Boorman & White, 1976, p. 1422, for details). The power of the joint reduction algebra JNT(B_1, B_2) as a basis for comparison between the blockmodels B_1, B_2 is evidenced by the fact that the algebraic procedure may be employed regardless of whether or not B_1 and B_2 describe the same population, or even possess similar numbers of blocks. Two independent algorithms for calculating the joint homomorphic reduction of two finitely presented semigroups on the same generator symbols have been developed in APL by Harrison White (JNTHOM algorithm) and by Francois Lorrain (MASTERGLB algorithm).7
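As an illustration of Definition 5 (though not of the joint-reduction computation itself, which JNTHOM and MASTERGLB perform), the finite closure under Boolean products can be sketched as follows. The representation of semigroup elements as byte strings, the function names, and the toy generator images are our own illustrative choices.

```python
import numpy as np
from itertools import product

def boolean_product(a, b):
    """Boolean matrix product: entry (i, j) is 1 iff a_ik = b_kj = 1 for some k."""
    return ((a.astype(int) @ b.astype(int)) > 0).astype(int)

def boolean_semigroup(generators, max_size=10000):
    """Close a list of 0/1 generator matrices under Boolean products.  The result
    is finite, being a subset of the 2**(m*m) binary relations on m blocks."""
    elements = {g.astype(int).tobytes(): g.astype(int) for g in generators}
    frontier = list(elements.values())
    while frontier:
        new = []
        for a, b in product(frontier, list(elements.values())):
            for prod_ in (boolean_product(a, b), boolean_product(b, a)):
                key = prod_.tobytes()
                if key not in elements:
                    elements[key] = prod_
                    new.append(prod_)
        if len(elements) > max_size:
            raise RuntimeError("semigroup too large to enumerate")
        frontier = new
    return list(elements.values())

# Toy 3-block images for two types of tie (invented, not taken from the text).
P = np.array([[1, 1, 0], [0, 1, 0], [0, 0, 1]])
N = np.array([[0, 0, 1], [1, 0, 0], [0, 1, 0]])
print(len(boolean_semigroup([P, N])))
```

Computing the joint reduction additionally requires keeping track of which words in the free semigroup map to which elements, which is the harder part handled by the APL programs.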
F. Prospects: Substantive Theory From a substantive standpoint, the essence of a blockmodel consists in its complete accounting of social relations within a system of positions, developed simultaneously for multiple types of tie. Classical theory within either social psychology or sociology has no close substitutes for this level of detail; conventional descriptions of complex social structures normally pay attention only to a highly limited fraction of the total number of relations actually present. Ambiguity is thereby generated, and attempts at summary become controversial. Only too frequently, a complicated theoretical analysis is rendered unworkable solely because “hierarchy,” or “servitude,” or some similar concept denotes distinct kinds of social structure in two different settings [compare Bloch’s assault (1960) on the comparison of land tenure systems in France and England in the Middle Ages]. By furnishing a systematic language for relational description, it is to be hoped that blockmodels may provide a substantial antidote to many traditional ills of social structural analysis. To return to our introductory theme, that of permutation methods and the sense in which they leave the data invariant, another aspect of blockmodeling is suggested. Blockmodel propositions (thus: Block 1 is the ally of Block 2, but Block 1 is internally a clique whereas Block 2 is not, etc.) have a transparent quality, in the sense that the reader may refer to the data and observe directly whether the proposition finds support (see again Tables 1 and 3). This essential simplicity of blockmodel hypotheses stands in marked contrast to the results obtained by most data analysis procedures. When applied in sociology and social psychology, such procedures invariably focus attention on relationships among highly derived measures-factor loadings, partial and multiple correlations, structural equation coefficients. Once such a level of aggregation has been achieved, it usually proves impossible to unravel and the only feasible course is the further manipulation of aggregate quantities. By contrast, blockmodels are always able to recover the reference standpoint of the individual, which may be evaluated against the communal viewpoint assigned by the blockmodel to this actor’s block. To relinquish this standpoint is to concede the greater part of the small advantage which mathematical sociology and social psychology now possess over the far more imposing edifice of mathematical economics, namely, the promise of the former fields to proceed with analyses of concrete populations so alien to the models of the latter area (compare observations in Koopmans, 1957). It is appropriate to conclude with a number of substantive challenges relevant to the future of mathematical analysis in the area. Significantly, each issue involves data as well as formalism. First, there is the problem of moving beyond affect ties. Subjectively important as these may be in determining the “mood” of a social structure, there is overwhelming evidence, much of it derived from depression economics, that much social structure persists regardless of how the actors feel about it or about each other. Other types of tie provide the sinews of everyday functioning: contact, alliance, command, legal liability, and debt are just a few of the possible examples. On the other hand, no one has ever reported complete networks for most of these examples, perhaps largely because such
networks will tend naturally to outbreed (so that it is hard to restrict data collection to a small population) and also because tie definitions are hard to establish consistently across even moderately far-flung populations. Second, there is the problem of relational contrast. As we have already seen, the most interesting blockmodel interpretations are those built around contrasting images whose internal patterns differ radically. Balance theory is the classical illustration (P versus N), but by no means the only case. Other examples are found in Breiger’s study (1976), already discussed, as well as in Boorman’s (1975) combinatorial model, which pits strong against weak ties. Nevertheless, the majority of sociometric studies completely fail to generate contrast (e.g., forced-choice studies recording only top choices by each individual). Moreover, contrast is not always easy to reveal in a nonexperimental setting. White’s (1961) study of management in a small company encountered major difficulties in extracting negative sentiment choices from managers, most of whom presumably feared the consequences of revealed factional strife to their own positions. In the future, attention should be directed to such concealment and masking effects, as well as to the general problem of analytic network definition. Finally, there is the problem of macroscopic social structure. Even the scarce existing work on large-scale networks (disproportionately owing to historians: e.g., Badian, 1958) leads to fresh vantages on some of the most promising areas in social science: among others, the structure of elite access and recruitment (White, 1970), small worlds and their manipulation (Travers & Milgram, 1969), grey areas between formal and informal organization (Williamson, 1975). Even the form of the proper questions to be asked remains unknown, or is only vaguely guessed, perhaps because the global pattern of a large, open network is not directly apparent to any one participant. Instead, the investigator is first thrown back on the supporting culture: the prevailing wisdom as to the attributes of various positions and their interrelations. Throughout its history, social scientists have assigned far more than due credit to such wisdom. Even when common knowledge about the structure and functioning of the system is tolerably correct, unperceived regularities will exist, if only because each participant is consigned to a single highly local vantage point. Blockmodels should furnish a promising means for new discoveries of global pattern and invariance.
APPENDIX OF DEFINITIONS
Note. Blockmodels are a comparatively new technical area, and many of the early papers of White’s group did not use consistent terminology. For this reason, the present glossary has been somewhat augmented to include cross-references to certain synonyms. In all cases, the term we prefer is that used in the present text.

Block. (1) A subset of rows or columns of a data matrix on which a blockmodel has been imposed, i.e., a cell or equivalence class of the partition of the actors. Alternatively, (2) a delimited submatrix of such a data matrix which is determined from the blockmodel or partition of the actors. (Note that such blocked submatrices are not generally of equal size, which is one of the reasons for reporting densities, q.v.)
Blockmodel. (1) For one or more square matrices describing multiple networks on the same population: a partition of the row (column) indices together with a procedure for coding the blocked submatrices determined from this partition (e.g., the zeroblock criterion, q.v., or cutoff density, q.v.). (2) For one or more rectangular matrices whose rows and columns index two disjoint populations: two partitions, one for rows and one for columns, together with a procedure for coding the blocked submatrices jointly determined by these partitions.

Bond.
See Oneblock.
Cutoff density. A parameter α, associated with the procedure for coding blockmodel images, q.v., in which blocked submatrices with density ≥ α are coded “1” (and all others “0”). Here α is a parameter imposed at the discretion of the investigator (see also Boorman & White, 1976, p. 1425).

Density. For a binary submatrix, the density is simply the number of unities divided by the number of entries, and thus has a range [0, 1]. For a nonbinary matrix, there are two possibilities: (1) the matrix is treated as if it were Boolean, and calculation of density proceeds as just described, or (2) the nonzero entries are summed as is, and divided by the number of entries. In the latter case, the density may exceed 1.0. In any case, whenever the density is calculated, all nonzero entries should possess the same sign, so as to avoid the possible inference of zero density spuriously indicative of the presence of a zeroblock.

Generator.
See Type of tie, q.v.
Homomorphism (graph). A mapping from a blocked matrix to an image matrix, q.v., which maps oneblocks, q.v., to unities, and zeroblocks, q.v., to 0’s. See Berge (1962) for an equivalent formulation in graph-theoretic terminology. [N.B.: “Homomorphism” in this sense is not to be confused with the purely algebraic concept of a semigroup homomorphism developed in Boorman & White (1976). See Lorrain (1975) for a general formalism subsuming both concepts.]

Image, image matrix. A matrix obtained from a blocked data matrix by replacing its blocked submatrices by either 0’s or 1’s according to some specified cutoff density. (Thus the 0’s and 1’s in the image matrix generally replace blocks having different dimensions.)

Oneblock. A submatrix of a blocked matrix which is not declared as a zeroblock, q.v. (and therefore contains at least one nonzero entry).

Stack. A procedure for preparing multiple relational data as CONCOR input, by which k square data matrices, each of dimensions n x n (with rows and columns depicting the same actors in identical orderings), are storied on top of one another to form a single new matrix of dimensions kn x n. CONCOR is then applied to the columns of the stacked matrix, so that the order of entering the original matrices into the stack is immaterial. See also Section 3B text.
Symmetric matrix. A square matrix M = [m_ij]_{n×n} (e.g., M_0) is symmetric if, for all i and j, m_ij = m_ji. This is evidently a criterion separately applicable to all pairs of entries (i, j), (j, i), and data matrices which are not fully symmetric will thus often contain some symmetric pairs, m_ij = m_ji. For this reason, it is frequently convenient to decompose a binary data matrix M_0 into a symmetric part (m′_ij = m′_ji) and an asymmetric part (m″_ij ≠ m″_ji), with M = M′ + M″ and m′_ij m″_ij = 0 [i.e., M′ and M″ are disjoint and their sum (= Boolean union) is M].

Tie. A condition or relation (e.g., “alliance,” “hostility,” or “net gain”) reported or inferred, and directed from the actor in row i toward the actor in column j. Properties of reflexivity, symmetry, or transitivity do not necessarily obtain.

Type of tie. A particular tie where multiple networks are present, normally coded as a Boolean matrix or directed graph (if the tie is binary), or as a real-valued matrix (otherwise).

Zeroblock. A blocked submatrix having density < α, the cutoff density (q.v.), and hence coded as “0” in the image matrix, q.v.
Zeroblock criterion. A systematic rule which codes a blocked submatrix as a “0” in the image matrix, q.v., only if this submatrix is strictly zero (so that the coding procedure corresponds to choosing cutoff density α = 0.0, q.v.).
ACKNOWLEDGMENTS

We are indebted to Lawrence J. Hubert for valuable technical comments. We also thank Auke Tellegen, William Batchelder, Ronald Breiger, J. Douglas Carroll, Clyde Coombs, Stephen E. Fienberg, Paul W. Holland, Eric Holman, Joseph B. Kruskal, Joseph Schwartz, and Harrison C. White for feedback on earlier versions of this paper. Valuable technical support for the research reported here was furnished by Dan C. Knutson and Dan Nichols, and the staff of the Bell Laboratories Technical Library. Additional feedback on related technical and interpretive aspects of blockmodeling was given by John Delany, William Panning, and Tony Boardman. Financial support for the present paper was obtained from NSF Grants GS-2689, SOC76-24512, and SOC76-24394, and funds from the Graduate School of the University of Minnesota. Much of the work reported here was completed while the first author was a Resident Visitor at Bell Laboratories, Murray Hill, New Jersey. Facilities for part of this research were generously provided by Professor Arthur Dempster of the Harvard Statistics Department under NSF Grant MCS75-01493.
7 Copies of these algorithms are available, with documentation, from the Department of Sociology, Harvard University.

REFERENCES

ABELSON, R. P., & ROSENBERG, M. J. Symbolic psycho-logic: A model of attitudinal cognition. Behavioral Science, 1958, 3, 1-13.
ACZÉL, J., & DARÓCZY, Z. On measures of information and their characterizations. New York: Academic Press, 1975.
ALVARADO, F. L. Computational complexity of operations involving perfect elimination sparse matrices. International Journal of Computer Mathematics B, 1977, 6, 69-82.
ANDO, A., FISHER, F. M., & SIMON, H. A. Essays on the structure of social science models. Cambridge, Mass.: MIT Press, 1963.
ARABIE, P. Concerning Monte Carlo evaluations of nonmetric multidimensional scaling algorithms. Psychometrika, 1973, 38, 607-608.
ARABIE, P. Clustering representations of group overlap. Journal of Mathematical Sociology, 1977, 5, 113-128.
AUSTIN, M. P., & GREIG-SMITH, P. The application of quantitative methods to vegetation survey, II. Journal of Ecology, 1968, 56, 827-844.
BADIAN, E. Foreign clientelae (264-70 B.C.). Oxford: Oxford University Press (Clarendon), 1958.
BAKER, F. B., & HUBERT, L. J. A graph-theoretic approach to goodness-of-fit in complete-link hierarchical clustering. Journal of the American Statistical Association, 1976, 71, 870-878.
BARKER, V. A. (Ed.). Sparse matrix techniques, Copenhagen 1976. Lecture Notes in Mathematics, No. 572. Berlin: Springer-Verlag, 1977.
BATCHELDER, W. H., & NARENS, L. A critical examination of the analysis of dichotomous data. Philosophy of Science, 1977, 44, 113-135.
BERGE, C. The theory of graphs and its applications. New York: Wiley, 1962.
BERNARD, P. Association and hierarchy: The social structure of the adolescent society. Unpublished doctoral dissertation, Department of Sociology, Harvard University, 1974.
BEUM, C. O., & BRUNDAGE, E. G. A method for analyzing the sociomatrix. Sociometry, 1950, 13, 141-145.
BEYLE, H. C. Identification and analysis of attribute-cluster-blocs. Chicago: University of Chicago Press, 1931.
BIRKHOFF, G. Lattice theory. Revised edition. Providence, R.I.: American Mathematical Society, 1967.
BLOCH, M. Seigneurie française et manoir anglais. Paris: Armand Colin, 1960.
BOISSEVAIN, J. Friends of friends: Networks, manipulators, and coalitions. Oxford: Blackwell, 1974.
BOORMAN, S. A. Metric spaces of complex objects. Unpublished honors thesis, Applied Mathematics, Harvard University, 1970.
BOORMAN, S. A. A combinatorial optimization model for transmission of job information through contact networks. Bell Journal of Economics, 1975, 6, 216-249.
BOORMAN, S. A. Informational optima in a formal hierarchy: Calculations using the semigroup. Journal of Mathematical Sociology, 1977, 5, 129-147.
BOORMAN, S. A., & ARABIE, P. Structural measures and the method of sorting. In R. N. Shepard, A. K. Romney, & S. B. Nerlove (Eds.), Multidimensional scaling: Theory and applications in the behavioral sciences, Vol. 1: Theory. New York: Seminar Press, 1972.
BOORMAN, S. A., & OLIVIER, D. C. Metrics on spaces of finite trees. Journal of Mathematical Psychology, 1973, 10, 26-59.
BOORMAN, S. A., & WHITE, H. C. Social structure from multiple networks. II. Role structures. American Journal of Sociology, 1976, 81, 1384-1446.
BREIGER, R. L. The duality of persons and groups. Social Forces, 1974, 53, 181-190.
BREIGER, R. L. Career attributes and network structure: A blockmodel study of a biomedical research specialty. American Sociological Review, 1976, 41, 117-135.
BREIGER, R. L. Toward an operational theory of community elite structure. Cambridge, Mass./New Haven, Conn.: Harvard-Yale Preprints in Mathematical Sociology, No. 6, August, 1977.
BREIGER, R. L., BOORMAN, S. A., & ARABIE, P. An algorithm for clustering relational data, with applications to social network analysis and comparison with multidimensional scaling. Journal of Mathematical Psychology, 1975, 12, 328-383.
BUNCH, J. R. Complexity of sparse elimination. In J. F. Traub (Ed.), Complexity of sequential and parallel numerical algorithms. New York: Academic Press, 1973.
BUNCH, J. R., & ROSE, D. J. Sparse matrix computations. New York: Academic Press, 1976.
CARROLL, J. D. Spatial, nonspatial and hybrid models for scaling. Psychometrika, 1976, 41, 439-463.
CARROLL, J. D., & PRUZANSKY, S. Fitting of hierarchical tree structure (HTS) models, mixtures of HTS models, and hybrid models, via mathematical programming and alternating least squares. Paper presented at the United States-Japan Seminar on Multidimensional Scaling, University of California at San Diego, La Jolla, California, August 20-24, 1975.
CARROLL, J. D., & WISH, M. Models and methods for three-way multidimensional scaling. In D. H. Krantz, R. C. Atkinson, R. D. Luce, & P. Suppes (Eds.), Contemporary developments in mathematical psychology, Vol. II. San Francisco: Freeman, 1974.
CLIFFORD, A. H., & PRESTON, G. B. The algebraic theory of semigroups. Providence, R.I.: American Mathematical Society, 1961.
COLEMAN, J. S., & MACRAE, D. Electronic processing of sociometric data for groups up to 1,000 in size. American Sociological Review, 1960, 25, 722-727.
COHN, P. M. Universal algebra. New York: Harper & Row, 1965.
COOMBS, C. H., & SMITH, J. E. K. On the detection of structure and developmental processes. Psychological Review, 1973, 80, 337-351.
CROWELL, R. H., & FOX, R. H. Introduction to knot theory. Boston: Ginn, 1963.
D'ANDRADE, R. G., QUINN, N. R., NERLOVE, S. B., & ROMNEY, A. K. Categories of disease in American-English and Mexican-Spanish. In A. K. Romney, R. N. Shepard, & S. B. Nerlove (Eds.), Multidimensional scaling: Theory and applications in the behavioral sciences, Vol. 2: Applications. New York: Seminar Press, 1972.
DAVIS, J. A. Clustering and hierarchy in interpersonal relations: Testing two graph theoretical models on 742 sociomatrices. American Sociological Review, 1970, 35, 843-852.
DAVIS, J. A., & LEINHARDT, S. The structure of positive interpersonal relations in small groups. In J. Berger, M. Zelditch, & B. Anderson (Eds.), Sociological theories in progress, Vol. 2. Boston: Houghton-Mifflin, 1972.
DELANY, J. L. A simulation of social networks describing job information flow. Unpublished, Department of Sociology, Yale University, 1976.
DEUTSCH, S. B., & MARTIN, J. J. An ordering algorithm for analysis of data arrays. Operations Research, 1971, 19, 1350-1362.
DEVLIN, S. J., GNANADESIKAN, R., & KETTENRING, J. R. Robust estimation of covariance and correlation matrices. Unpublished manuscript, Bell Telephone Laboratories, Murray Hill, N.J., 1975.
DORFMAN, R., SAMUELSON, P. A., & SOLOW, R. M. Linear programming and economic analysis. New York: McGraw-Hill, 1957.
DUCAMP, A., & FALMAGNE, J. C. Composite measurement. Journal of Mathematical Psychology, 1969, 6, 359-390.
DUFF, I. S. A survey of sparse matrix research. Proceedings of the IEEE, 1977, 65, 500-535.
FRIEDELL, M. F. Notes on cognitive structure. Unpublished M.A. thesis, Department of Sociology, University of Chicago, 1962.
GELFAND, A. E. Seriation methods for archaeological materials. American Antiquity, 1971, 36, 263-274.
GIACOMELLI, F., WIENER, J., KRUSKAL, J. B., POMERANZ, J. V., & LOUD, A. V. Subpopulations of blood lymphocytes demonstrated by quantitative cytochemistry. Journal of Histochemistry and Cytochemistry, 1971, 19, 426-433.
GOODALL, D. W. Objective methods for the classification of vegetation. I. The use of positive interspecific correlation. Australian Journal of Botany, 1953, 1, 39-63.
GRIFFITH, B. C., MAIER, V. L., & MILLER, A. J. Describing communications networks through the use of matrix-based measures. Unpublished, Graduate School of Library Science, Drexel University, 1973.
GRUVAEUS, G., & WAINER, H. Two additions to hierarchical cluster analysis. British Journal of Mathematical and Statistical Psychology, 1972, 25, 200-206.
HARARY, F., NORMAN, R. Z., & CARTWRIGHT, D. Structural models: An introduction to the theory of directed graphs. New York: Wiley, 1965.
HARTIGAN, J. A. Clustering algorithms. New York: Wiley, 1975.
HEIL, G. H., & WHITE, H. C. An algorithm for finding simultaneous homomorphic correspondences between graphs and their image graphs. Behavioral Science, 1976, 21, 26-35.
HILL, M. O. Correspondence analysis: A neglected multivariate method. Journal of the Royal Statistical Society C, Applied Statistics, 1974, 23, 340-354.
HODSON, F. R., KENDALL, D. G., & TAUTU, P. (Eds.). Mathematics in the archaeological and historical sciences. Edinburgh: Edinburgh Univ. Press, 1971.
HOLLAND, P. W. Analyzing sociometric data. In B. B. Wolman (Ed.), International encyclopedia of neurology, psychiatry, psychoanalysis and psychology. New York: Van Nostrand, 1977.
HOLLAND, P. W., & LEINHARDT, S. A method for detecting structure in sociometric data. American Journal of Sociology, 1970, 76, 492-513.
HOLLAND, P. W., & LEINHARDT, S. The structural implications of measurement error in sociometry. Journal of Mathematical Sociology, 1973, 3, 85-111.
HOLLAND, P. W., & LEINHARDT, S. The statistical analysis of local structure in social networks. Cambridge, Mass.: National Bureau of Economic Research, Working Paper 44, 1974.
HOLLAND, P. W., & LEINHARDT, S. A dynamic model for social networks. Journal of Mathematical Sociology, 1977, 5, 5-20.
HOMANS, G. C. The human group. New York: Harcourt, Brace, 1950.
HUBERT, L. J. Monotone invariant clustering procedures. Psychometrika, 1973, 38, 47-62.
HUBERT, L. J. Problems of seriation using a subject by item response matrix. Psychological Bulletin, 1974a, 81, 976-983.
HUBERT, L. J. Spanning trees and aspects of clustering. British Journal of Mathematical and Statistical Psychology, 1974b, 27, 14-28.
HUBERT, L. J. Some applications of graph theory to clustering. Psychometrika, 1974c, 39, 283-309.
HUBERT, L. J. Some applications of graph theory and related non-metric techniques to problems of approximate seriation: The case of symmetric proximity measures. British Journal of Mathematical and Statistical Psychology, 1974d, 27, 133-153.
HUBERT, L. J. Seriation using asymmetric proximity measures. British Journal of Mathematical and Statistical Psychology, 1976, 29, 32-52.
HUBERT, L. J., & BAKER, F. B. Evaluating the conformity of sociometric measurements. Psychometrika, 1978, 43, in press.
JENNINGS, A. A sparse matrix scheme for the computer analysis of structures. International Journal of Computer Mathematics, 1968, 2, 1-21.
JOHNSON, S. C. Hierarchical clustering schemes. Psychometrika, 1967, 32, 241-254.
KATZ, L. On the matric analysis of sociometric data. Sociometry, 1947, 10, 233-241.
KELLER, J. B. Comment on "Channels of Communication in Small Groups." American Sociological Review, 1951, 16, 842-843.
KENDALL, D. G. Incidence matrices, interval graphs, and seriation in archaeology. Pacific Journal of Mathematics, 1969, 28, 565-570.
KENDALL, D. G. A mathematical approach to seriation. Philosophical Transactions of the Royal Society of London A, 1970, 269, 125-135.
KENDALL, D. G. Abundance matrices and seriation in archaeology. Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete, 1971, 17, 104-112.
KOOPMANS, T. C. Three essays on the state of economic science. New York: McGraw-Hill, 1957.
KRUSKAL, J. B. Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika, 1964a, 29, 1-27.
KRUSKAL, J. B. Nonmetric multidimensional scaling: A numerical method. Psychometrika, 1964b, 29, 28-42.
KRUSKAL, J. B. Linear transformation of multivariate data to reveal clustering. In R. N. Shepard, A. K. Romney, & S. B. Nerlove (Eds.), Multidimensional scaling: Theory and applications in the behavioral sciences, Vol. 1. New York: Seminar Press, 1972.
KUPERSHTOKH, V. L., & MIRKIN, B. G. Ordering of interrelated objects I, II. Automation and Remote Control [Avtomatika i Telemekhanika], 1971, 32, 924-929; 1093-1098.
LAMBERT, J. M., & WILLIAMS, W. T. Multivariate methods in plant ecology. IV. Nodal analysis. Journal of Ecology, 1962, 50, 775-802.
LANCE, G. N., & WILLIAMS, W. T. A general theory of classificatory sorting strategies. I. Hierarchical systems. Computer Journal, 1967, 9, 373-380. (a)
LANCE, G. N., & WILLIAMS, W. T. A general theory of classificatory sorting strategies. II. Clustering systems. Computer Journal, 1967, 10, 271-277. (b)
LANDAU, H. G. On dominance relations and the structure of animal societies. Bulletin of Mathematical Biophysics, 1951, 13, 1-19.
LANDAU, H. G. Development of structure in a society with a dominance relation when new members are added successively. Bulletin of Mathematical Biophysics, 1965, 27 (Special Issue), 151-160.
LANDAU, J., & DE LA VEGA, F. A new seriation algorithm applied to European protohistoric anthropomorphic statuary. In F. R. Hodson, D. G. Kendall, & P. Tautu (Eds.), Mathematics in the archaeological and historical sciences. Edinburgh: Edinburgh University Press, 1971.
LEIJONHUFVUD, A. On Keynesian economics and the economics of Keynes. New York: Oxford University Press, 1968.
LEIK, R. K., & MEEKER, B. F. Mathematical sociology. Englewood Cliffs, N.J.: Prentice-Hall, 1975.
LENSTRA, J. K. Clustering a data array and the traveling-salesman problem. Operations Research, 1974, 22, 413-414.
LENSTRA, J. K., & RINNOOY KAN, A. H. G. Some simple applications of the traveling-salesman problem. Operational Research Quarterly, 1975, 26, 717-733.
LEVELT, W. J. M., VAN DE GEER, J. P., & PLOMP, R. Triadic comparisons of musical intervals. British Journal of Mathematical and Statistical Psychology, 1966, 19, 163-179.
LEVINE, J. H. The sphere of influence. American Sociological Review, 1972, 37, 14-27.
LIN, T. D., & MAH, R. S. H. Hierarchical partition - a new optimal pivoting algorithm. Mathematical Programming, 1977, 12, 260-278.
LING, R. F. An exact probability distribution on the connectivity of random graphs. Journal of Mathematical Psychology, 1975, 12, 90-98.
LINGOES, J. C., & COOPER, T. Probability evaluated partitions. I. Behavioral Science, 1971, 16, 259-261.
LORRAIN, F. P. Handbook of two-block two-generator models. Unpublished, Department of Sociology, Harvard University, 1973.
LORRAIN, F. P. Réseaux sociaux et classifications sociales. Paris: Hermann, 1975. (English language version: Doctoral dissertation, Department of Sociology, Harvard University, 1972.)
LORRAIN, F. P., & WHITE, H. C. Structural equivalence of individuals in social networks. Journal of Mathematical Sociology, 1971, 1, 49-80.
LUCE, R. D. Connectivity and generalized cliques in sociometric group structure. Psychometrika, 1950, 15, 169-190.
LUCE, R. D., & GALANTER, E. Psychophysical scaling. In R. D. Luce, R. R. Bush, & E. Galanter (Eds.), Handbook of mathematical psychology, Vol. 1. New York: Wiley, 1963.
MCCONAGHY, M. J. Maximum possible error in Guttman scales. Public Opinion Quarterly, 1975, 34, 343-357.
MCCORMICK, W. T., JR., SCHWEITZER, P. J., & WHITE, T. W. Problem decomposition and data reorganization by a clustering technique. Operations Research, 1972, 20, 993-1009.
MCQUITTY, L. L. Multiple clusters, types, and dimensions from interactive intercolumnar correlation analysis. Multivariate Behavioral Research, 1968, 3, 465-477.
MCQUITTY, L. L., & CLARK, J. A. Clusters from iterative, intercolumnar correlational analysis. Educational and Psychological Measurement, 1968, 28, 211-238.
MIRKIN, B. G., & CHERNYI, L. B. Measurement of the distance between distinct partitions of a finite set of objects. Automation and Remote Control [Avtomatika i Telemekhanika], 1970, 31, 786-792.
MULLINS, N. C., HARGENS, L. L., HECHT, P. R., & KICK, E. L. The group structures of two scientific specialties: A comparative study. American Sociological Review, 1977, 42, 552-562.
NEEDHAM, R. M. Applications of the theory of clumps. Mechanical Translation, 1965, 8, 113-127.
NEWCOMB, T. M. The acquaintance process. New York: Holt, Rinehart and Winston, 1961.
NORDLIE, P. B. A longitudinal study of interpersonal attraction in a natural group setting. Ann Arbor, Mich.: University Microfilms, 1958. No. 58-7775.
ORE, O. Theory of equivalence relations. Duke Mathematical Journal, 1942, 9, 573-627.
PETRIE, W. M. F. Sequences in prehistoric remains. Journal of the Anthropological Institute of Great Britain and Ireland (N.S.), 1899, 29, 295-301.
PORSCHING, T. A. On the origins and numerical solution of some sparse nonlinear systems. In J. R. Bunch & D. J. Rose (Eds.), Sparse matrix computations. New York: Academic Press, 1976.
REID, J. K. (Ed.). Large sparse sets of linear equations. New York: Academic Press, 1971.
RICE, S. A. The identification of blocs in small political bodies. American Political Science Review, 1927, 21, 619-627.
ROBINSON, A. Introduction to model theory and to the metamathematics of algebra. Amsterdam: North-Holland, 1965.
ROSE, D. J., & WILLOUGHBY, R. A. (Eds.). Sparse matrices and their applications. New York: Plenum, 1972.
RUHE, A. Computation of eigenvalues and eigenvectors. In V. A. Barker (Ed.), Sparse matrix techniques, Copenhagen 1976. Lecture Notes in Mathematics, No. 572. Berlin: Springer-Verlag, 1977.
SALTON, G. Automatic information organization and retrieval. New York: McGraw-Hill, 1968.
SAMPSON, S. F. Crisis in a cloister. Ann Arbor, Mich.: University Microfilms, 1969. No. 69-5775.
SCARF, H. The approximation of fixed points of a continuous mapping. SIAM Journal of Applied Mathematics, 1967, 15, 1328-1343.
SCHWARTZ, J. E. An examination of iterative blocking algorithms for sociometry. In D. R. Heise (Ed.), Sociological methodology, 1977. San Francisco: Jossey-Bass, 1976.
SHAFTO, M. Cluster analysis by linear contrasts. Research Bulletin RB-72-35, Educational Testing Service, Princeton, N.J., August, 1972.
SHEPARD, R. N. The analysis of proximities: Multidimensional scaling with an unknown distance function. I. Psychometrika, 1962a, 27, 125-140.
SHEPARD, R. N. The analysis of proximities: Multidimensional scaling with an unknown distance function. II. Psychometrika, 1962b, 27, 219-246.
SHEPARD, R. N. A taxonomy of some principal types of data and of multidimensional methods for their analysis. In R. N. Shepard, A. K. Romney, & S. B. Nerlove (Eds.), Multidimensional scaling: Theory and applications in the behavioral sciences, Vol. 1: Theory. New York: Seminar Press, 1972.
SHEPARD, R. N. Representation of structure in similarity data: Problems and prospects. Psychometrika, 1974, 39, 373-421.
SHEPARD, R. N., & ARABIE, P. Additive cluster analysis of similarity data. Paper presented at the United States-Japan Seminar on Multidimensional Scaling, University of California at San Diego, La Jolla, California, August 20-24, 1975.
SHEPARD, R. N., & CARROLL, J. D. Parametric representation of nonlinear data structures. In P. R. Krishnaiah (Ed.), Multivariate analysis. New York: Academic Press, 1966.
SLOBODKIN, L. B. Growth and regulation of animal populations. New York: Holt, Rinehart & Winston, 1961.
STEVENS, S. S. Issues in psychophysical measurement. Psychological Review, 1971, 78, 426-450.
STEWARD, D. V. Partitioning and tearing systems of equations. SIAM Journal of Numerical Analysis B, 1965, 2, 345-365.
TEWARSON, R. P. Sorting and ordering sparse linear systems. In J. K. Reid (Ed.), Large sparse sets of linear equations. New York: Academic Press, 1971.
TEWARSON, R. P. Sparse matrices. New York: Academic Press, 1973.
THEIL, H. Economics and information theory. Amsterdam: North-Holland, 1967.
TRAVERS, J., & MILGRAM, S. An experimental study of the "small world problem." Sociometry, 1969, 32, 425-443.
WARD, J. H., JR. Hierarchical grouping to optimize an objective function. Journal of the American Statistical Association, 1963, 58, 236-244.
WEIL, R. L., JR. The decomposition of economic production systems. Econometrica, 1968, 36, 260-270.
WEIL, R. L., JR., & KETTLER, P. C. Rearranging matrices to block-angular form for decomposition (and other) algorithms. Management Science, 1971, 18, 98-108.
WHITE, H. C. Management conflict and sociometric structure. American Journal of Sociology, 1961, 67, 185-199.
WHITE, H. C. The uses of mathematics in sociology. In J. C. Charlesworth (Ed.), Mathematics and the social sciences. Philadelphia: Monograph, Annals of the American Academy of Political and Social Science, 1963.
WHITE, H. C. Chains of opportunity. Cambridge, Mass.: Harvard Univ. Press, 1970.
WHITE, H. C. Do networks matter? Unpublished, Department of Sociology, Harvard University, 1972. [Paper presented at Mathematical Social Science Board Advanced Research Workshop on Social Networks, Camden, Maine, June, 1972.]
WHITE, H. C. Probabilities of homomorphic mappings from multiple graphs. Journal of Mathematical Psychology, 1977, 16, 121-134.
WHITE, H. C., BOORMAN, S. A., & BREIGER, R. L. Social structure from multiple networks. I. Blockmodels of roles and positions. American Journal of Sociology, 1976, 81, 730-780.
WILKINSON, E. M. Archaeological seriation and the travelling salesman problem. In F. R. Hodson, D. G. Kendall, & P. Tautu (Eds.), Mathematics in the archaeological and historical sciences. Edinburgh: Edinburgh University Press, 1971.
WILKINSON, J. H. Rounding errors in algebraic processes. Notes on Applied Science, No. 32. London: Her Majesty's Stationery Office, 1963.
WILKINSON, J. H. The algebraic eigenvalue problem. London: Oxford Univ. Press, 1965.
WILLIAMSON, O. E. Markets and hierarchies: Analysis and antitrust implications. New York: Free Press, 1975.
RECEIVED: February 11, 1977