Introducing exponential random graph models for visibility networks




Journal of Archaeological Science 49 (2014) 442–454

Contents lists available at ScienceDirect

Journal of Archaeological Science journal homepage: http://www.elsevier.com/locate/jas

Tom Brughmans*, Simon Keay, Graeme Earl

Archaeological Computing Research Group, Department of Archaeology, University of Southampton, Avenue Campus, Highfield, Southampton SO17 1BF, United Kingdom

Article info

Article history: Received 1 August 2013; received in revised form 16 May 2014; accepted 19 May 2014; available online xxx

Abstract

Archaeological network analysts often represent archaeological data as static networks and explore their structure. However, most networks changed through time and static network representations do not allow archaeologists to test assumptions about the dynamic processes driving this change. The study of visibility networks in archaeology is a good example of this. Archaeologists propose hypotheses of the role of lines of sight between settlements, which imply dynamic processes for the establishment of the observed visibility networks. However, commonly used methods do not allow us to evaluate these hypotheses. In this paper we introduce exponential random graph modelling (ERGM) as a method for bridging static and dynamic approaches to interpreting visibility networks. This method offers a number of advantages: (1) it explicitly addresses the assumptions inherent in visibility network creation about what relationships between nodes mean and the types of processes they allow for; (2) it allows one to investigate the range of network structures that these assumptions give rise to; and (3) it explores the dynamic processes that might have led to observed networks. This method is used to evaluate hypotheses of the role of lines of sight in facilitating visual control and communication during the later Iron Age in Southern Spain. This study shows that ERGMs can be used as a reflective technique to evaluate competing hypotheses, and that ERGM results subsequently require more contextualised evaluation. Future work on ERGMs should focus on incorporating geographical constraints to further enhance its potential for studying visibility networks. © 2014 Elsevier Ltd. All rights reserved.

Keywords: Exponential random graph models; ERGM; Network analysis; Graph theory; Visibility analysis; Iron Age Spain

1. Introduction¹

In this paper we introduce exponential random graph modelling (ERGM) as a method for formally expressing and testing the assumptions archaeologists formulate about the dynamic processes giving rise to visibility networks. ERGM was originally developed for formulating hypotheses about social processes that might have produced empirically observed social networks, but this approach has never before been applied in an archaeological context or used for studying visibility networks. We believe ERGM has great potential for making the theoretical assumptions about dynamic processes inherent in many archaeological networks explicit. This paper aims to explore the potential of ERGM for the study of visibility networks in archaeology. We will use the example of inter-settlement visibility networks to illustrate the key concepts of ERGM.

In section two of this paper we will show that it is common practice for archaeological network analysts to formulate assumptions about the dynamic processes behind the networks they study. We believe that postulating the existence of these processes purely based on exploratory network analysis is problematic and that, where possible, a statistical method is needed to link empirically observed networks with assumptions of dynamic past processes. In the third section we will describe the technical details of ERGM and introduce the method of creating ERGMs for visibility networks. In the fourth section we illustrate this method by presenting a simple case study. In the case study we evaluate hypotheses of the role of lines of sight in facilitating visual control and communication in Iron Age II Southern Spain. This is followed by a discussion of the advantages and issues of this method for the study of visibility networks, and recommendations for future methodological development of ERGM.

* Corresponding author. E-mail addresses: [email protected] (T. Brughmans), [email protected]. uk (S. Keay), [email protected] (G. Earl).
¹ The following abbreviations are used: exponential random graph model(ling) (ERGM), social network analysis (SNA), digital elevation model (DEM), root mean square error (RMSE).
http://dx.doi.org/10.1016/j.jas.2014.05.027
0305-4403/© 2014 Elsevier Ltd. All rights reserved.

2. Dependence assumptions: dynamic processes in archaeological networks

Network representations of archaeological data are often used as static snapshots conflating an ever-changing dynamic past
(e.g. Brughmans, 2010; Golitko et al., 2012). By performing an exploratory network analysis we get an idea of their structure during a given period of time. Such an approach can be considered a type of exploratory data analysis. However, archaeologists interpret these networks as representations of the past phenomena that we are ultimately interested in understanding. Given that most past phenomena involve change through time, it is entirely plausible that at an earlier or later stage in time a given network could have had a different structure. Exploratory network techniques can describe and represent these different stages, but they are very poor at evaluating the processes driving these changes. A commonly used technique for archaeologists to overcome this problem is to formulate theoretical assumptions about how past phenomena changed over time through the emergence or disappearance of relationships between pairs of nodes in their data networks (from here-on referred to as dependence assumptions). Such dependence assumptions are frequently accompanied by (explicitly formulated or implied) expectations of the kinds of network patterns the assumptions give rise to. In other words, archaeologists frequently make theoretical statements about dynamic processes that cause change in past phenomena, formulate how they can be represented as network data patterns, and subsequently identify these specific patterns in networks of archaeological data. When discussing the social processes that caused a network to change from one state to another, archaeological network analysts have so far relied on the identification in an observed network's static structure of the patterns considered to be the typical outcomes of hypothetical processes. We therefore rarely evaluate whether these dynamic processes can actually give rise to the networks we study, nor do we consider the effect multiple dependence assumptions in combination can have on the structure of networks. 
There is a need for a method that allows archaeologists to overcome this problem, and the current paper presents such a method for the study of visibility networks. The study of visibility networks in archaeology (e.g. Davidson, 1979; De Montis and Caschili, 2012; Fraser, 1983; Ruestes Bitria, 2008; Shemming and Briggs, 2014; Swanson, 2003; Tilley, 1994) serves as a particularly good example of how archaeological network analysts typically study processes of network creation. In visibility networks, entities of research interest with a certain spatial location such as burial mounds, megaliths, or settlements (Iron Age II settlements in the example presented in this paper) are represented as nodes. A pair of nodes A and B is connected by a directed relationship (here referred to as an arc) if an observer standing at the location of node A can see the location of node B, i.e. a line of sight connects both locations (Fig. 1). Underlying the archaeological use of visibility networks are the assumptions that lines of sight could have been intentionally created to structure the surrounding space, and that the study of these lines of sight might reveal aspects of how they structured space and what it meant to past peoples. Wheatley and Gillings (2000, 3), for example, defined the term visibility as “past cognitive/perceptual acts that served to not only inform, structure, and organise the location and form of cultural features, but also to choreograph practice within and around them.” Llobera (2003, 2007) similarly emphasises the role of visibility patterns in structuring space through the intentional positioning of physical features in the landscape. It is up to the archaeologist to decipher if and how this structuring was achieved in order to identify exactly which patterns were intentionally created, and most importantly to try to understand the role lines of sight played in the past.
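The network representation described above (sites as nodes, lines of sight as directed arcs) can be sketched in a few lines of Python. This is only an illustrative sketch: the site names, the list of sightlines, and the helper names `build_visibility_network`, `has_arc` and `intervisible` are assumptions made for the example and do not come from the paper or its data.

```python
from collections import defaultdict

# Hypothetical line-of-sight observations as (observer site, visible site).
# The sites and sightlines are invented purely for illustration.
lines_of_sight = [
    ("A", "B"),
    ("B", "A"),
    ("A", "C"),
    ("D", "C"),
]


def build_visibility_network(observations):
    """Map each site to the set of sites that can be seen from it."""
    network = defaultdict(set)
    for observer, target in observations:
        if observer == target:
            continue  # self-loops are not meaningful arcs in this representation
        network[observer].add(target)
        network[target]  # touch the key so sites that are only seen become nodes too
    return dict(network)


def has_arc(network, source, target):
    """True if an observer at `source` can see `target`."""
    return target in network.get(source, set())


def intervisible(network, a, b):
    """True if arcs exist in both directions between `a` and `b`."""
    return has_arc(network, a, b) and has_arc(network, b, a)


network = build_visibility_network(lines_of_sight)
```

In this toy network A and B are inter-visible, whereas C can be seen from A and D without an arc running back from C. That asymmetry is why the arcs are treated as directed.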
Fig. 1. (a) An observer located at site A can see site B, and vice versa. The lines of sight connecting these two sites can be represented as a visibility network (b) where nodes represent sites and arcs represent lines of sight.

Archaeologists have used visibility networks as a method for studying the role lines of sight could have had in structuring past
human behaviour, for example through communication networks using fire or smoke signalling, or the visual control settlements exercise over the surrounding landscape and settlements. Formulating dependence assumptions for visibility networks implies a sequence of events where new lines of sight will be established as a reaction to pre-existing lines of sight. For example, if we observe that a settlement is positioned in a visually prominent location from where many other settlements can be seen, more so than any of the surrounding settlements, then we might formulate the hypothesis that this location was intentionally selected to enhance communication with or visual control over neighbouring settlements. Similarly, if an effective signalling network was considered during selection of the location for a new settlement, then settlement locations inter-visible with other settlements creating a chain of inter-visible settlements would have been preferred. However, archaeological network analysts have so far studied these processes exclusively through an analysis of static network representations. By pointing out the patterns of interest, an exploratory network analysis can only take us so far to evaluate our dependence assumptions, leaving hypotheses surrounding the intentional creation of visibility patterns untested. A good example of this is Tilley's (1994) study of a network of inter-visibility between barrows on Cranborne Chase: “One explanation for this pattern might be that sites that were particularly important in the prehistoric landscape and highly visible ‘attracted’ other barrows through time, and sites built later elsewhere were deliberately sited so as to be intervisible with one or more other barrows. In this manner the construction of barrows on Cranborne Chase gradually created a series of visual pathways and nodal points in the landscape” (Tilley, 1994, 159). 
This quote shows how Tilley interprets an observed network pattern as the intentionally established outcome of an untested process of locating barrows at locations inter-visible with one or more other barrows. In order to overcome this problem, a statistical approach is needed that succeeds in expressing dependence assumptions and simulating the network patterns these assumptions give rise to, so that we can compare these simulated patterns with the observed visibility networks. In this paper we argue that exponential random graph modelling (ERGM) is such a method. ERGM offers a number of advantages: (1) it explicitly addresses the assumptions inherent in visibility network creation about what arcs between nodes mean and the types of processes they allow for; (2) it allows one to investigate the different network structures that particular assumptions give rise to; and (3) it allows one to explore the dynamic processes that might have led to observed networks. The next section will introduce the key concepts of ERGM.


3. Exponential random graph modelling

3.1. Definition

ERGMs² belong to a family of statistical models originally developed for social networks (Anderson et al., 1999; Wasserman and Pattison, 1996) that aim to investigate the dependence assumptions underpinning hypotheses of network formation by comparing the frequency of particular configurations in observed networks with their frequency in stochastic models. Several terms in this definition (dependence assumptions, configurations, observed networks) have a specific meaning in network science and will be defined below.

As mentioned above, dependence assumptions are theoretical assumptions grounded in the idea that pairs of nodes do not just become connected independent of what happens in the rest of the network: “The presence of some ties will encourage other ties to come into existence, to be maintained, or to be destroyed” (Robins, 2011: 485). These assumptions, therefore, reflect the researcher's theories of how ties emerge relative to their position in the network. A very common dependence assumption in social networks is “a friend of my friend is my friend”, or more formally, a pair of nodes which have a mutual neighbour in the network have a tendency to become neighbours themselves (often referred to as “transitivity”, Fig. 2a–b). A common dependence assumption for visibility networks in archaeology is that when an observer can see settlement B from settlement A, it is likely that an observer can see settlement A from settlement B, i.e. an assumption of inter-visibility of settlements (Fig. 2c–d).

We use the term observed network here to refer to the network created on the basis of data collected by archaeologists. The researcher using ERGM is interested in modelling the observed network (Robins et al., 2007a, 175). In visibility networks this would typically be a set of nodes representing the observation locations (e.g. settlements) connected by a set of arcs representing lines of sight (Fig. 1).
Visibility network data can either be collected by observations in the field, or more formally by using visibility analysis techniques in a GIS (Conolly and Lake, 2006; Wheatley and Gillings, 2002). An ERGM aims to study the range of processes that could give rise to such networks, and it is therefore crucial for the observed visibility network to be as complete as possible if the ERGM is to suggest realistic processes. Moreover, the selection of the boundaries of the visibility network will need to be clearly argued for, and the impact these boundaries have on the results of the ERGM will need to be assessed when interpreting the results. In archaeology the observed visibility networks are often a pattern that aggregates evidence over a long timespan. This is less of a problem than the issue of having a complete network, in particular for relatively slow-changing processes such as settlement patterns, as long as the researcher is confident about the contemporaneity of the set of nodes that make up the observed network. For example, in the case study presented here, although we consider a period of two centuries, the archaeological evidence suggests that all 159 settlements in the visibility network were occupied at the end of this period.

Fig. 2. (a–b) Example of a social network where nodes are individuals and edges represent friendship. A pair of individuals with a mutual friend (a) has a tendency of becoming friends themselves (b). (c–d) Example of a visibility network where nodes are settlements and arcs are lines of sight between them. If an observer can see one settlement from another (c), it is likely that both settlements are inter-visible (d). (e) This network consists of 5 nodes, 7 arcs, and 2 reciprocity configurations.

² These are sometimes called p* models to distinguish them from the earlier p1 (Holland and Leinhardt, 1981) and p2 (Lazega and van Duijn, 1997) model classes.
³ The term configurations is used here following key publications in ERGM (e.g. Robins, 2011) and first used by Moreno and Jennings (1938), instead of the term motifs which recently became popular through the work of Milo et al. (2002).

Configurations³ are small network patterns consisting of a few nodes and the arcs between them (e.g. Fig. 3). They play a number of roles in the ERGM procedure: representing dependence assumptions, describing observed network structure, comparison
with simulated networks, and as effects in the ERGM. Dependence assumptions can be formally represented by particular configurations. The transitivity dependence assumption in Fig. 2 could be suitably represented by a closed triangle (often called a transitive triad) because in it a pair of nodes with a mutual neighbour is connected by an arc, whilst inter-visibility could be represented by arcs in two directions (referred to as reciprocity). We can also describe an observed visibility network by counting the frequency of each configuration in the network. This provides a way of describing a visibility network's structure, but also allows comparison with the number of configurations of simulated networks. For example, the network shown in Fig. 2e consists of five nodes, seven arcs, and two reciprocity configurations. This information is used to determine how similar the networks simulated by a certain ERGM are to the observed network (for more detail see Section 3.5). When creating an ERGM researchers select those configurations to be included as effects in the model which they believe to be representations of their dependence assumptions. This means that the model will not let these particular configurations that are of research interest emerge purely by chance, but rather it will estimate whether there is a positive or negative tendency for each configuration to appear throughout the simulation process. For example, in the case study below we describe an ERGM which includes the reciprocity configuration as an effect, and the results indicate that in this model there is a tendency for settlements to be inter-visible. These configurations are assembled through a stochastic process: at each time step two randomly selected nodes are considered and an arc may be created or removed between them. The probability that an arc is created between these two nodes is determined by the effects in the model, and therefore by the presence or absence of other ties. 
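Two ideas from the paragraph above can be made concrete with a short sketch: counting configurations in a network, and letting the probability of an arc depend on the presence of other arcs. The network below is invented so that it has the same summary counts as the example in Fig. 2e (five nodes, seven arcs, two reciprocity configurations); the actual arcs of Fig. 2e are not given in the text. The parameter values and function names are likewise assumptions for illustration, and the conditional log-odds form used in `arc_probability` is a standard way of writing ERGMs with arc and reciprocity effects rather than something taken from this paper.

```python
import math

# Hypothetical directed network: 5 nodes, 7 arcs, 2 mutual pairs.
nodes = ["A", "B", "C", "D", "E"]
arcs = {
    ("A", "B"), ("B", "A"),  # first inter-visible pair
    ("C", "D"), ("D", "C"),  # second inter-visible pair
    ("A", "C"), ("E", "A"), ("E", "D"),
}


def count_arcs(arcs):
    """Number of directed arcs."""
    return len(arcs)


def count_reciprocity(arcs):
    """Number of node pairs connected by arcs in both directions."""
    return sum(1 for (i, j) in arcs if i < j and (j, i) in arcs)


def count_two_in_stars(nodes, arcs):
    """Number of pairs of arcs that point at the same node."""
    total = 0
    for node in nodes:
        in_degree = sum(1 for (_, target) in arcs if target == node)
        total += in_degree * (in_degree - 1) // 2
    return total


def arc_probability(arcs, i, j, theta_arc, theta_reciprocity):
    """Conditional probability of arc i->j given the rest of the network,
    for a model with only an arc effect and a reciprocity effect.
    Adding i->j raises the arc count by 1 and the reciprocity count by 1
    if j->i is already present."""
    change_reciprocity = 1 if (j, i) in arcs else 0
    log_odds = theta_arc + theta_reciprocity * change_reciprocity
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical parameter values: arcs are rare, reciprocation is favoured.
THETA_ARC = -4.0
THETA_RECIPROCITY = 3.0

p_unreciprocated = arc_probability(arcs, "B", "C", THETA_ARC, THETA_RECIPROCITY)
p_reciprocating = arc_probability(arcs, "C", "A", THETA_ARC, THETA_RECIPROCITY)
```

With a positive reciprocity parameter, the candidate arc from C to A (which would reciprocate the existing arc from A to C) is more probable than the candidate arc from B to C, for which no reverse arc exists.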
Fig. 3. Configurations used in this model and the dependence assumptions which they represent.

To give the example of our case study, when in an ERGM with a strong inter-visibility effect a pair of nodes A and B is
considered that already have one arc from A to B, then the probability will be high that another arc from B to A is created in that time step. This stochastic simulation process is an implementation of the idea that the observed network is only one particular outcome out of a wide range of possible networks. We do not know what process generated the observed network; this is what we are trying to find out. But we do know the dependence assumptions we can formulate based on our theories. The goal of an ERGM is to draw on these assumptions to propose a plausible theoretical hypothesis for the process that led to the observed network (Robins et al., 2007a, 175).

3.2. Creation process of ERGMs for visibility networks

Now that we have defined ERGM and its most important concepts, the next thing to do is explain step-by-step how an ERGM is created. In their general framework for ERGM construction Robins et al. (2007a) describe five steps followed when creating an ERGM.

By going through these steps archaeologists can test their theoretical decisions about how lines of sight are created through statistical data analysis. We will discuss these five steps and give examples relevant to the analysis of visibility networks. Fig. 4 offers a simplified overview of this design process.

Fig. 4. A simplified representation of the creation process of an ERGM. (1a) an empirically observed network is considered; (1b) in a simulation we assume that every arc between every pair of nodes can be either present or absent; (2) dependence assumptions are formulated about how ties emerge relative to each other (e.g. the importance of inter-visibility for communication); (3) configurations or network building blocks are selected that best represent the dependence assumptions (e.g. reciprocity and 2-path); (4) different types of models are created (e.g. a model without dependence assumptions (Bernoulli random graph model) and one with the previously selected configurations) and the frequency of all configurations in the graphs simulated by these models is determined; (5) the number of configurations in the graphs simulated by the models are compared with those in the observed network and interpreted.

1. Each arc can either be present or absent: we start with a fixed set of nodes that are unconnected, and assume that throughout the simulation every pair of nodes can either be connected or not. Although the arc is considered a random variable, some arcs will have a higher probability of appearing than others. For the visibility networks in our case study this means that the number of settlements remains the same throughout the simulation and that settlement A can either be seen from settlement B or not with a certain probability (see Section 3.3).

2. A dependence assumption is proposed: this is the most crucial step and concerns the explicit formulation of our dependence assumptions representing the proposed processes generating the network, i.e. we decide how arcs affect each other's presence or absence. Five theoretical dependence assumptions about how lines of sight are established are commonly formulated by archaeologists studying visibility networks (Fig. 3). Firstly, if communication or signalling was considered important, then settlements will be expected to be inter-visible. Secondly, in order for a given site to visually control surrounding settlements, these need to be visible from it. Thirdly, if a settlement is purposefully visually prominent it needs to be visible from surrounding settlements. Fourthly, if visual isolation is considered important, settlements will be expected to be invisible from surrounding settlements. Finally, lines of sight are expected to emerge independent of each other in a purely random fashion with equal probability if inter-settlement visibility did not influence site location.

3. The dependence hypothesis implies a particular form to the model: the dependence assumptions formulated above can be represented by particular configurations. This means the theoretical assumptions of how lines of sight between settlements emerge have to be represented as network data, i.e. nodes and arcs. The researcher should select those configurations to be included in the model that are considered the best representation of the dependence assumptions (Fig. 3). Multiple configurations can be included.

4. Simplification of models: the hypotheses archaeologists formulate can often be a complex mix of the assumptions introduced above. Although multiple configurations can be included in an ERGM to represent this complexity from the start, this should be avoided since the more configurations included in the model, the harder it becomes to understand which configuration causes a good fit between the model and the observed visibility network. For this reason it is recommended that one starts with a simple model with few effects, and gradually builds up the complexity of the model by adding more effects. The simplest assumption mentioned in point 2 above is that lines of sight emerge independently of each other. Such an assumption could be represented by an ERGM with only one effect, which is the probability that an arc will be created (such models are called Bernoulli random graph models; Erdős and Rényi, 1959). After evaluating and interpreting how well this model fits with the observed networks, one could then increase the complexity of the model by adding more effects (e.g. a reciprocity effect). Moreover, one should also consider whether some effects can be equated or related in some way, in order to limit the number of effects included in the model.

5. Estimate and interpret model parameters: the previous four steps are arguably the most important ones, since they determine the formulation of theoretical assumptions, their representation as network data, and the creation of a proposed model. The goal of an ERGM is to find a set of parameter values (representations of how important particular configurations are for generating the observed patterns) that best represent a single observed network. The observed network can then be interpreted in light of these configurations and the dependence assumptions underlying them. The researcher runs the model and estimates parameter values for all of the configurations in the model, i.e. whether certain configurations have a positive or negative tendency of appearing in the simulated networks. This is an iterative process where parameter values are gradually refined until one ends up with a model with parameter values that give rise to networks with frequencies of the included configurations very similar to the observed network (see Section 3.4). One then performs a goodness of fit test to evaluate whether this model also gives rise to similar counts of configurations or aspects of the network's structure (such as the
degree distribution) that were not explicitly included in the model (see Section 3.5). This is done to confirm whether the model succeeds in generating non-modelled features of the observed networks. The final parameter values of the included configurations are subsequently interpreted. It is important not to over-interpret the results: a good rule of thumb is to only distinguish between positive or negative tendencies to creating a certain configuration, and to pay particular attention to statistically significant effects. For example, if in an ERGM for a visibility network the reciprocity effect is significant and positive (as in the case study below), one can formulate the following interpretation: “in the process that led to the creation of this visibility network, there was a tendency for settlements to become inter-visible, more so than expected by chance”.

In the following three sections some of these steps are described in more technical detail (see also Lusher et al., 2013; Robins et al., 2007a, 2007b, 2009).

3.3. The form of an ERGM

The definition and creation process of an ERGM can be expressed more formally as equation (1). All ERGMs have the same general form. To this general form different dependence assumptions can be added depending on the hypotheses tested. Equation (1) describes a general probability distribution of networks, where the probability that a particular network will exist in this distribution (Pr) is dependent on the configuration parameter in the model ($\eta_A$) and the count of this configuration in the observed network ($z_A(x)$). On the left hand side of this equation we distinguish between the randomly generated network ($X$) and the observed network ($x$); both have the same number of nodes (or settlements in the case study presented below). Between every pair of nodes there can either be an arc or not (i.e. settlements can either be connected by a line of sight or not). In the observed visibility network we know exactly which nodes are connected by a line of sight, but in randomly generated networks this line of sight will be created with a certain probability. This probability is determined by the effects one includes in the ERGM. More formally, for every pair $i$ and $j$ that are distinct nodes of a set $N$ of $n$ nodes, a random variable $X_{ij}$ exists, where $X_{ij} = 1$ if there is an arc from node $i$ to node $j$, and $X_{ij} = 0$ if there is no arc. If $X_{ij}$ is an arc random variable that can have a value 1 or 0 with a certain probability, then let $x_{ij}$ be the observed value (the arc that is part of our observed visibility network). Similarly, we define $X$ as the matrix of all variables and $x$ the matrix of observed ties. Since nodes are not supposed to have self-loops, the diagonals of these matrices are empty. In the case of our visibility networks, $X$ is directed, which means that $X_{ij}$ is distinct from $X_{ji}$ (Robins, 2011; Robins et al., 2007a). The general form of an ERGM is:

$$\Pr(X = x) = \frac{1}{\kappa} \exp\left\{ \sum_{A} \eta_A z_A(x) \right\} \qquad (1)$$

where the summation is over all configuration types $A$; $\eta_A$ is a parameter corresponding to configuration type $A$, which reflects the dependence assumption of a particular configuration (i.e. $\eta_A$ cannot be zero if the frequency of configuration type $A$ is considered to be dependent on the rest of the network); $z_A(x)$ is a count of the number of configurations $A$ observed in $x$; $\kappa$ is a normalising quantity which ensures that equation (1) is a proper probability distribution (Robins, 2011; Robins et al., 2007a).

3.4. Estimation

An ERGM goes through a process of estimation before it can be fitted to the observed networks. The process of estimation described here and applied in this case study is called the Monte Carlo Markov Chain Maximum Likelihood Estimation (MCMCMLE) approach (Koskinen and Snijders, 2013). The estimation process is aimed at refining the parameter values (the weight attributed to the configurations) by comparing the frequency of modelled configurations in the observed network against that in a distribution of random networks generated by a stochastic simulation using the approximate parameter values. These parameter values are adjusted through iterating the simulation so that the means of the values of the configuration in question can get as close as possible to the observed values. Closeness is measured by a t-ratio, which is derived for the estimate of every configuration at every simulation and is calculated as follows:

$$t = \frac{\text{observed frequency of configuration} - \text{mean of simulated frequencies of configuration}}{\text{standard deviation of simulated frequencies}} \qquad (2)$$

The t-ratio indicates how well the estimate has converged with the observed data; a good convergence is indicated by t-ratios for parameter estimates of less than 0.1 in absolute value. These final parameter values are called the maximum likelihood values. Statistically significant effects (here indicated by *) have a parameter estimate in absolute value more than twice the standard error. Table 1 offers an example of this for a visibility network. We see that the t-ratio for the reciprocity effect is less than 0.1, indicating that the estimated parameter value for reciprocity produces a similar frequency of this configuration as has been observed in the visibility network. Moreover, the estimate is positive and more than twice the standard error, indicating there is a significant tendency towards the creation of inter-visible arcs, more than expected by chance.

Table 1. Example estimate of one effect in an ERGM for a visibility network (fragment of the case study results presented in Table 5).

Effects        Estimates    Standard error    t-Ratio
Reciprocity    8.00         0.79              0.07       *

3.5. Goodness of fit and interpretation

Once maximum likelihood values are obtained for the configurations included in the model the ERGM needs to be fitted to the observed network. This is done to evaluate whether the frequencies of observed configurations included in the model are well reproduced by the model, as well as to check if all the other features of the observed network that are not explicitly modelled are replicated (e.g. degree distribution). The rationale behind this “goodness of fit” test is that the plausibility of an ERGM is higher if it can replicate all or most of the features of an observed network (Robins et al., 2007b: 206). The guidelines set out in Harrigan (2007) for determining whether the goodness of fit results suggest that the model is plausible are commonly used:

1. “If the parameter was estimated and specified in your model … then the t-statistic needs to be below 0.1 (as it was in the estimation).”

2. “If the parameter was not estimated and specified in your model … then the t-statistic should be below 2 for the model to not be a bad fit.”

To give an example, Table 2 offers a fragment of the goodness of fit results of an ERGM for a visibility network, where the reciprocity effect was included in the model whilst the 2-in-star effect was not. It shows that the mean number of these configurations in the simulated networks is very close to their frequency in the observed network. Moreover, this table offers the information needed to calculate the t-ratio as described in equation (2) above. For the reciprocity effect this is (32 − 31.966)/2.612 = 0.013 and for the 2-in-star effect this is (78 − 79.134)/21.836 = −0.052. Since both are lower than 0.1 in absolute value they indicate a good fit between the model and the observed network.

Table 2 Example of the goodness of fit results of an ERGM for a visibility network (fragment of the case study results).

Effects        Observed    Mean     Standard deviation    t-Ratio
Reciprocity    32          31.97    2.61                  0.01
2-in-star      78          79.13    21.84                 −0.05

The results of the estimated configurations are subsequently interpreted. A positive parameter estimate indicates a tendency to form this particular configuration more often than expected purely by chance, and a negative parameter estimate indicates that the configuration appears less often than expected purely by chance. Robins et al. (2009, Table 1) provide a useful key for technical interpretation of effects in ERGMs of directed networks, such as visibility networks. Although a technical interpretation of ERGM results is a necessary first step, this should always be followed by a discussion of the importance and implications of the results within the archaeological research context. In the case study presented below we provide a brief example of this process.

4. Case study: visibility networks in Iron Age Southern Spain⁴

In this section we provide an applied example of an archaeological hypothesis concerning visibility networks that can be addressed with ERGM. We describe the decisions made in the creation process of a visibility network and an ERGM, and how these decisions affect the outcomes of the simulated networks. This example is drawn from a study of inter-settlement visibility networks in Iron Age and Roman Southern Spain (Brughmans et al., in press), performed in the framework of the ‘Urban Connectivity in Iron Age and Roman Southern Spain’ project. The Supplementary material includes a KML file with all site locations, the visibility network matrix used here, an index of all sites, and the goodness of fit tables for the models shown.

⁴ The software used for creating ERGMs in this case study is PNET (Harrigan, 2007; Robins et al., 2007a, 2007b; 2009; Robins, 2011; Wang et al., 2009). Alternatives to PNET include the ‘ergm’ package as part of the ‘statnet’ package in R (Handcock et al., 2003), as well as StOCNET.

4.1. The hypothesis

The Iron Age (early 5th c. BC to late 3rd c. BC) settlement pattern in Southern Spain consisted of large and often fortified nuclear settlements, sometimes labelled as oppida. These settlements were regularly spaced in the landscape, housed substantial populations dependent on agriculture and formed the nuclei for surrounding smaller rural settlements (Escacena and Belén, 1998; Ruiz Rodríguez, 1997). Iron Age sites in southern Spain are often located on hilltops, terraces or at the edges of plateaux, and at some of these sites there is evidence of defensive architecture. These combinations of features may indicate that settlement locations were purposefully selected for their defendable nature and the ability to visually control the surrounding landscape, or even for their inter-visibility with other urban settlements. The same arguments have been put forward for eastern Andalucía (Ruiz and Molinos, 1993) and the Mediterranean coast of Spain (Grau Mira, 2004). However, the degree of visual control or inter-visibility of Iron Age settlements has never been formally studied for this area. Visibility networks between Iron Age settlements have been the focus of study in eastern Spain (Grau Mira, 2003, 2005; Ruestes Bitrià, 2008), but inter-settlement visibility analyses in Southern Spain are rare (e.g. Garrido González, 2011). Archaeologists have argued that it is probable that the patterns of visibility between settlements in this area were, at least in part, intentionally created and that these patterns played a role in structuring the interactions between Iron Age communities (Grau Mira, 2005). More specifically, two mechanisms were suggested: (a) the importance of inter-visibility of sites identified as oppida for communicating information through signalling, and (b) the ability of oppida to exercise visual control over surrounding settlements and landscapes (Grau Mira, 2003, 2005; Ruestes Bitrià, 2008). Both the intervisibility and visual control have been suggested to “favour the social interaction and protection of the people” (Grau Mira, 2005, 331). Yet to state that these patterns might have been intentionally created implies a sequential creation of lines of sight aimed at allowing for inter-visibility and visual control.
As the studies by Grau Mira and Ruestes Bitrià illustrate, the outcome of this process can be identified and analysed with exploratory analysis of visibility networks, but the testing of the process itself and whether or not it was intentionally implemented requires a statistical simulation approach. In this case study we will illustrate the use of ERGM for exploring the hypothesis that inter-visibility and visual control were purposefully established in the settlement pattern of Iron Age II Southern Spain. This will be done by evaluating the extent to which certain processes representing inter-visibility and visual control can give rise to the observed visibility network.

Inline Supplementary Table S1 can be found online at http://dx.doi.org/10.1016/j.jas.2014.05.027.

4.2. Creating visibility networks

The first problem in a study of visibility networks comes with the creation of the network itself. There is no single visibility network that captures the way in which lines of sight affected the behaviour of past communities. A large number of factors need to be considered, and different decisions for how to deal with these can lead to visibility networks with very different structures: the selection of the study area boundary, the accuracy and resolution of the Digital Elevation Model (DEM), the selection of observation points and observer height, knowledge of past natural and architectural features affecting visibility. However, all of these decisions


depend on the theoretical assumptions of the researcher and the quality of the available data. Our aim to test the above-mentioned hypothesis motivated our decision to represent settlements as nodes and lines of sight as arcs. A collection of 159 sites for which the location is known and which have evidence of occupation in the Iron Age period was included. These sites include both major Iberian settlements and rural settlements, but our knowledge of the rural settlements is less complete; this will be taken into account when interpreting the results. The settlement pattern focuses on the area between modern Seville and Córdoba, the western half of the lands inhabited by the Turdetani, a people living in the lower Guadalquivir valley. It is most dense in the fertile Campiña area to the South of the river Guadalquivir, and less dense in the hills north and south of the river (Fig. 5). This collection of sites represents the best of our knowledge up to 2005 when data collection for the ‘Urban Connectivity in Iron Age and Roman Southern Spain’ project was finished. A 35 m resolution DEM was created with the ‘Topo to Raster’ interpolation method in ArcGIS 9.3 (selected because it recreates a more correct representation of ridges from input point and contour data, features that have a significant impact on the results of visibility analyses), using point and contour line data (source: ICA, Junta de Andalucía; contour interval 10 m). Lines of sight were derived by performing a probable viewshed (Fisher, 1992, 1995) with 100 iterations for each site. This allowed us to distinguish between lines of sight of high and low probability. A single observer point per site was used for the visibility analysis, since for the vast majority of sites there was no data available of the occupied area and extent of settlement. However, the decision to use a single observer location is problematic and its impact on the results will


need to be evaluated in future work for smaller study areas with a better knowledge of occupied settlement areas. An observer height of 1.7 m was assumed. Moreover, we assumed that the observed location height is that of the DEM cell on which the observed settlement is located, since very few settlements have evidence of architectural features that could be included in the analysis. The resulting visibility network (Fig. 6) was subsequently analysed using exploratory network measures, which revealed it consists of a number of components and three areas with a higher density of lines of sight (Brughmans et al., in press). The network used in the ERGMs includes lines of sight up to 20 km, at which distance fire and smoke signals are still visible, and with a probability higher than 50% (an arbitrary threshold, but a sensitivity analysis of network measures using different thresholds showed that it captures the key features of the network's structure (Brughmans et al., in press)). The final network, which will now be referred to as the “observed network” (prepared in the SNA software package UCINET), includes 159 nodes connected by 84 arcs.

4.3. Creating ERGMs

The ERGM creation process is one of trial-and-error, where our theoretical assumptions and the structure of the observed network motivate the creation of multiple models, the results of which are then compared with the observed network. The dependence assumptions we are interested in and the configurations used to represent them are those discussed in Section 3.2 and shown in Fig. 3. As a first step, we considered the simplest dependence assumption by creating a Bernoulli random graph model: lines of sight emerge independently of one another with a certain

Fig. 5. Location of the 159 sites occupied in the Iron Age II in Southern Spain. An index of all sites mentioned here is included in the Supplementary material.


Fig. 6. (a) Topological and (b) geographical representations of the visibility network used in this case study. All lines of sight are up to 20 km length and have a probability larger than 50%.

Table 3 Count of the frequency of each configuration in the observed network. Configurations we consider particularly good representations of the dependence assumptions we are interested in are given in bold. See Wang et al. (2009) for a description of each configuration.

Configuration       Frequency    Configuration          Frequency
Vertices            159          Alternating-instar     54.5625
Arc                 84           Alternating-outstar    50.5625
Reciprocity         32           Ain1out-star(2.00)     86.0625
in-2star            78           1inAout-star(2.00)     87.125
out-2star           73           AinAout-star(2.00)     48.875
in-3star            59           AT-T(2.00)             43.375
out-3star           57           AT-C(2.00)             42.875
2-path              147          AT-D(2.00)             44
T1                  6            AT-U(2.00)             42.5
T2                  41           AT-TD(2.00)            43.6875
T3                  46           AT-TU(2.00)            42.9375
T4                  23           AT-DU(2.00)            43.25
T5                  23           AT-TDU(2.00)           43.29167
T6                  53           A2P-T(2.00)            133.125
T7                  129          A2P-D(2.00)            65.25
T8                  122          A2P-U(2.00)            71.125
Transitive triad    52           A2P-TD(2.00)           99.1875
T10(030C)           17           A2P-TU(2.00)           102.125
Sink                8            A2P-DU(2.00)           68.1875
Source              11           A2P-TDU(2.00)          89.83333
Isolates            104

Table 4 ERGM for Iron Age II visibility network limited to 20 km and with arcs >50% probability. Asterisks indicate significant effects for which the absolute value of the estimate is more than twice the standard error. A positive estimate value indicates a tendency towards the creation of this type of configuration, while a negative estimate value indicates a tendency against this configuration. This ERGM does not succeed in reproducing all aspects of the structure of the observed network.

probability. This means that the first ERGM we created included only one parameter, the probability of arc creation. When we perform a goodness of fit test to compare the structure of the observed network with that of networks simulated by the Bernoulli random graph model, we notice that they are significantly different (in Inline Supplementary Table 2 the t-ratios for all configurations were much higher than 2).⁵ The results suggest that the observed visibility network was not created by a purely random process. This is not a surprising result, since social processes are rarely random. The Bernoulli random graph model should be considered as a benchmark of what the data would be expected to look like if the dataset was created at random. The results presented here allow us to explore alternative models that better succeed in reproducing the observed network.

Inline Supplementary Table S2 can be found online at http://dx.doi.org/10.1016/j.jas.2014.05.027.

A good starting point when creating ERGMs that contain more configurations is to explore the structure of the observed network, and in particular the frequency of each configuration (Table 3). From Table 3 we learn that the configurations which we consider most suitable representations of our dependence assumptions, shown in bold, are common in the observed network, but other configurations are even more common. This suggests that a model which only includes the configurations representing the dependence assumptions we focus on here might not succeed in reproducing all aspects of the structure of the observed network. To evaluate this we decided to first estimate a model with only the configurations which we are most interested in, representing intervisibility (reciprocity), visual isolation (isolates), visual control (Alternating-out-star), and visual prominence (Alternating-in-star, see Fig. 3).
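The size of the gap between the Bernoulli benchmark and the observed network can be illustrated with a minimal sketch. It is not the simulation-based procedure run in PNET: it assumes the arc probability is fixed at the observed density (84 arcs among 159 nodes) and that the reciprocity statistic counts reciprocated pairs, which allows a closed-form mean and standard deviation. The function `t_ratio` is a hypothetical helper written for this example; it follows the (observed − mean)/standard deviation calculation used for Table 2.

```python
import math


def t_ratio(observed: float, mean: float, sd: float) -> float:
    """t-ratio: (observed count - mean count under the model) / standard deviation."""
    return (observed - mean) / sd


# Figures reported in the text: 159 nodes, 84 arcs, 32 reciprocity configurations.
N_NODES, N_ARCS, OBSERVED_RECIPROCITY = 159, 84, 32

# Bernoulli model: each ordered pair (i, j), i != j, gets an arc with probability p_arc.
ordered_pairs = N_NODES * (N_NODES - 1)
p_arc = N_ARCS / ordered_pairs  # assumption: probability fixed at the observed density

# Unordered pairs are independent under this model, so the number of reciprocal
# pairs is binomial with success probability p_arc ** 2.
unordered_pairs = ordered_pairs // 2
expected_reciprocity = unordered_pairs * p_arc**2
sd_reciprocity = math.sqrt(unordered_pairs * p_arc**2 * (1 - p_arc**2))

bernoulli_t = t_ratio(OBSERVED_RECIPROCITY, expected_reciprocity, sd_reciprocity)
table2_t = t_ratio(32, 31.966, 2.612)  # reciprocity row of Table 2, for comparison

print(f"Bernoulli benchmark: expected {expected_reciprocity:.2f} reciprocal pairs, "
      f"t = {bernoulli_t:.1f}")
print(f"Table 2 reciprocity: t = {table2_t:.3f}")
```

Under these assumptions the benchmark expects far fewer than one reciprocal pair, so the observed count of 32 lies dozens of standard deviations away, in line with the t-ratios "much higher than 2" reported for the Bernoulli model.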
⁵ Goodness of fit test with 50,000,000 simulated networks from which a sample of 1000 networks was drawn at regular intervals. From this distribution of networks the mean frequency of each configuration (which is normally distributed) was used to calculate the t-ratio.

The results are presented in Table 4, and indicate a positive and significant tendency towards settlements being intervisible, and towards some settlements being far more visually prominent or controlling than other settlements. The negative tendency towards settlements being visually isolated is not significant. However, as feared, the goodness of fit results indicate that this model does not succeed in reproducing some of the configuration frequencies seen in the observed network, specifically the degree of clustering and a number of triangle configurations (see Inline Supplementary Table 3). Nevertheless, the results also show that a model with just four effects succeeds in reproducing the indegree and outdegree distributions, which suggests that the configurations responsible for this (alternating-in-star and alternating-out-star) should definitely be retained in further models.

Inline Supplementary Table S3 can be found online at http://dx.doi.org/10.1016/j.jas.2014.05.027.

Effects                 Estimates    Standard error    t-Ratio
Reciprocity             8.06         0.70              0.02*
Isolates                −0.29        0.50              0.03
Alternating-in-star     1.13         0.27              0.01*
Alternating-out-star    0.76         0.27              0.04*

These results indicate that this particular ERGM is not a good model for understanding the processes that led to the observed visibility network. Therefore, we decided to create a final model which includes additional configurations that might make up for the badly reproduced aspects of the observed network as identified by the goodness of fit results. The final ERGM is shown in Table 5 and a goodness of fit test suggests that this model fully succeeds in reproducing the structure of the observed network (see Inline Supplementary Table 4). It is also important to mention that we attempted to estimate a number of alternative models using configurations which we believed to be theoretically feasible. However, none of these succeeded in reproducing all aspects of the observed network's structure. We also estimated models for a network that includes lines of sight up to 50 km, to evaluate how sensitive the ERGM creation process is to the boundary selection of the observed network.
Although a good model for this network was found, it only pointed out a significant and positive reciprocity effect, as well as a few other non-significant effects with similar results to the ERGM presented in Table 5. Considering lines of sight up to 50 km hardly allows for the identification of settlements, smoke and fire signals, and therefore the dependence assumptions formulated make little sense for this network. This caused the resulting ERGM to have very few significant results, making its explanatory potential very limited. From this exercise we conclude that the results of an ERGM are only informative if the selection of the boundaries of the network is theoretically motivated, and if the tested dependence assumptions make sense within these boundaries.

Inline Supplementary Table S4 can be found online at http://dx.doi.org/10.1016/j.jas.2014.05.027.

Table 5 ERGM for Iron Age II visibility network limited to 20 km and with arcs >50% probability. Asterisks indicate significant effects for which the absolute value of the estimate is more than twice the standard error. A positive estimate value indicates a tendency towards the creation of this type of configuration, while a negative estimate value indicates a tendency against this configuration. This ERGM succeeds in reproducing the structure of the observed network.

Effects                 Estimates    Standard error    t-Ratio
Reciprocity             8.00         0.79              0.07*
path2                   −0.52        0.17              0.07*
Transitive triad        0.40         0.05              0.06*
Sink                    −2.29        1.33              0.05
Source                  −1.28        1.39              0.03
Isolates                −3.23        1.53              0.10*
Alternating-in-star     2.35         0.92              0.08*
Alternating-out-star    2.61         0.96              0.08*

4.4. Interpreting the results

We have now created an ERGM that manages to reproduce our observed network. But what does it really tell us about the past phenomenon we are trying to understand? How does this model help us study the way in which lines of sight between settlements structured past human behaviour? A good process for interpreting these statistical results is to start with a technical description of the model results (we will use Fig. 3 to help us with this), followed by an interpretation of what this means in archaeological terms, and finally embedding these results within a wider archaeological research context, confronting them with other sources of evidence and approaches. Since this is merely an example of the use of ERGMs for visibility networks we will only present the first two steps here. The importance of these results within a much wider chronological framework is discussed in more detail in Brughmans et al. (in press).

First of all, we notice more significant effects in the model in Table 5, which showed a good fit to the observed network, than in that in Table 4, which showed a poor fit, suggesting that a more complex mix of factors than that presented in the model in Table 4 led to the creation of the observed network.
It is crucial to realise that none of these effects can be understood in isolation; they need to be interpreted together, since each effect affects every other in the model. We see a significant positive reciprocity effect, indicating a tendency for settlements to become inter-visible. The significant positive alternating-in-star and alternating-out-star effects indicate a tendency towards a spread of the indegree and outdegree distributions. This means that there is a tendency for a few settlements to be far more visually controlling or visually prominent than most other settlements. Moreover, these effects are combined with a significant negative 2-path effect, suggesting that settlements that are visually prominent tend not to be visually controlling. It also suggests that paths through the network necessary for passing on information through signalling might not have been purposefully established. Instead, we notice a significant positive transitive triad effect, indicating that where 2-paths existed they tended to be closed. We believe these results to be a better representation of a process leading to clusters of inter-visible settlements around a visually controlling settlement, than of a process leading to a communication network. A significant negative isolates effect indicates that in general settlements do tend to be included in the visibility network. This requires some explanation, because although we notice a very high number of isolated nodes (104 out of 159), this model's results tell us that in conjunction with the other effects in the model there are fewer isolated nodes than we would expect purely by chance. Finally, we see a negative tendency towards the emergence of sinks (no outgoing lines of sight) and sources (no incoming lines of sight), effects that are not significant and should therefore be interpreted with caution.
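The reading of Tables 4 and 5 given above rests on one rule: an effect is treated as significant when the absolute value of its estimate exceeds twice its standard error. A minimal sketch of that rule follows. The `Effect` type and the helper functions are hypothetical names introduced for this example, and the sign of each estimate is taken from the description of the effect in the text (positive or negative tendency).

```python
from typing import NamedTuple


class Effect(NamedTuple):
    name: str
    estimate: float
    std_error: float


def is_significant(effect: Effect) -> bool:
    """Rule stated for Tables 4 and 5: |estimate| greater than twice the standard error."""
    return abs(effect.estimate) > 2 * effect.std_error


def tendency(effect: Effect) -> str:
    """Positive estimates indicate a tendency towards a configuration, negative against."""
    return "towards" if effect.estimate > 0 else "against"


# Estimates and standard errors as reported in Table 4 (signs follow the text).
TABLE_4 = [
    Effect("Reciprocity", 8.06, 0.70),
    Effect("Isolates", -0.29, 0.50),
    Effect("Alternating-in-star", 1.13, 0.27),
    Effect("Alternating-out-star", 0.76, 0.27),
]

# Estimates and standard errors as reported in Table 5 (signs follow the text).
TABLE_5 = [
    Effect("Reciprocity", 8.00, 0.79),
    Effect("path2", -0.52, 0.17),
    Effect("Transitive triad", 0.40, 0.05),
    Effect("Sink", -2.29, 1.33),
    Effect("Source", -1.28, 1.39),
    Effect("Isolates", -3.23, 1.53),
    Effect("Alternating-in-star", 2.35, 0.92),
    Effect("Alternating-out-star", 2.61, 0.96),
]

significant_4 = [e.name for e in TABLE_4 if is_significant(e)]
significant_5 = [e.name for e in TABLE_5 if is_significant(e)]

for effect in TABLE_5:
    flag = "significant" if is_significant(effect) else "not significant"
    print(f"{effect.name}: tendency {tendency(effect)} this configuration ({flag})")
```

Applied to the reported values, the rule flags the same effects that carry asterisks in the two tables, with sinks and sources remaining non-significant in the final model.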
We conclude that a network creation process which purposefully establishes inter-visibility and visual control can give rise to the structure of the observed visibility network. However, such a process is complex and cannot be understood without other effects in the model. Most crucially, there seems to be no tendency towards the establishment of paths of inter-visibility, necessary for a well-functioning signalling network. Instead, we believe a process creating clusters around visually controlling settlements is better supported by the observed network.

5. Discussion and conclusions

In this paper we have introduced exponential random graph modelling as a method that has great potential for enhancing the study of archaeological data networks, and of visibility networks in particular. We have argued that archaeological network analysts have tended to focus on exploring the outcomes of past phenomena, rather than on the processes that give rise to them. We believe that the dependence assumptions archaeologists formulate when representing archaeological data as networks are key to overcoming this problem, and ERGM offers a formal method for doing so. ERGM forces the researcher to explicitly formulate their theoretical assumptions of what nodes and arcs represent, and of how one arc affects the presence or absence of any other arc in the network. It offers a way to represent such theoretical dependence assumptions as network data, using small sets of nodes and arcs called configurations. Finally, with ERGMs one can test whether the dependence assumptions we formulate can actually give rise to an observed network or not, and one can explore a wide range of abstract processes which could be considered representations of alternative scenarios. In doing so, ERGMs allow archaeologists to formulate new hypotheses of the past processes that drove the phenomena they are interested in, which are focused on the more narrow range of processes which the ERGMs suggest can lead to the datasets available to us.
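To make the role of configurations concrete, the general form in equation (1) can be evaluated exhaustively for a very small directed network. The sketch below is an illustration only: it uses three nodes, two configurations (arc and reciprocity) and arbitrary parameter values chosen for the example, none of which come from the case study. With only 64 possible networks the normalising constant can be computed by direct enumeration, which is not feasible for a network of 159 nodes.

```python
import itertools
import math

NODES = 3
# All ordered pairs (i, j) with i != j: the possible arcs of a directed network.
PAIRS = [(i, j) for i in range(NODES) for j in range(NODES) if i != j]

# Hypothetical parameter values for the two configurations in this toy model.
ETA_ARC, ETA_RECIPROCITY = -1.0, 2.0


def configuration_counts(arcs: frozenset) -> tuple:
    """Return (number of arcs, number of reciprocated pairs) in a set of arcs."""
    n_arcs = len(arcs)
    n_reciprocal = sum(1 for (i, j) in arcs if i < j and (j, i) in arcs)
    return n_arcs, n_reciprocal


def weight(arcs: frozenset) -> float:
    """Unnormalised ERGM weight: exp of the sum of parameter times configuration count."""
    n_arcs, n_reciprocal = configuration_counts(arcs)
    return math.exp(ETA_ARC * n_arcs + ETA_RECIPROCITY * n_reciprocal)


def all_networks():
    """Yield every directed network on NODES nodes as a frozenset of arcs."""
    for mask in itertools.product([0, 1], repeat=len(PAIRS)):
        yield frozenset(pair for pair, bit in zip(PAIRS, mask) if bit)


# Normalising constant: the sum of weights over all possible networks.
KAPPA = sum(weight(g) for g in all_networks())


def probability(arcs: frozenset) -> float:
    """Probability of one network under the toy model."""
    return weight(arcs) / KAPPA


mutual = frozenset({(0, 1), (1, 0)})   # two arcs forming one reciprocated pair
one_way = frozenset({(0, 1), (1, 2)})  # two arcs, no reciprocated pair
print(probability(mutual) / probability(one_way))
```

Both example networks contain two arcs, so the ratio of their probabilities depends only on the reciprocity parameter. This is the sense in which a positive parameter expresses a tendency towards a configuration beyond what the number of arcs alone would produce.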
The case study presented here also suggests ERGM makes a promising methodological contribution to the study of visibility networks, both as an exploratory method and as a method for hypothesis testing. Scholars studying visibility networks have tended to support their hypotheses of the way in which lines of sight affected human behaviour by identifying patterns in static networks. Yet many hypotheses concerning the intentional establishment of certain patterns imply a particular process. ERGM is a method that allows one to test whether our assumptions about these processes can actually give rise to the observed visibility network, and it offers a new tool for exploratory visibility network analysis by breaking these networks down into the configurations they are built out of. The case study has also taught us that data quality and theory are crucial for formulating models of past phenomena. It is clear that ERGMs are only as reliable as the datasets they are based on, suggesting that for some datasets (e.g. long-distance visibility networks) it is difficult to make statistical inferences about processes with much certainty. One needs to be confident that the observed network is as close as possible to complete, or one needs to make theoretical arguments why the created network can be used for testing one's hypotheses. Our experiment of modelling a network with lines of sight up to 50 km illustrated this most clearly: this network did not allow for the evaluation of hypotheses concerning inter-visibility or visual control among settlements, therefore the effects included in ERGMs were not suitable for explaining the network's creation. The impact of so-called ‘edge-effects’ by determining the geographical boundary of the network of interest and the impact of thresholding the network on the probability or distance of its lines of sight need to be evaluated more thoroughly in future work through sensitivity analyses (as suggested by Peeples and Roberts, 2013).
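A sensitivity analysis of the thresholds mentioned above could start from something as simple as the following sketch, which rebuilds the arc set for several distance and probability cut-offs and reports how many arcs survive. The `LineOfSight` record type, the `build_arcs` helper and the six example records are all hypothetical; in the case study the probability would come from the share of probable viewshed iterations in which a target site is visible.

```python
from typing import NamedTuple


class LineOfSight(NamedTuple):
    source: str
    target: str
    distance_km: float
    probability: float  # share of viewshed iterations in which the target is visible


def build_arcs(records, max_distance_km: float, min_probability: float) -> set:
    """Keep arcs no longer than max_distance_km with probability above min_probability."""
    return {
        (r.source, r.target)
        for r in records
        if r.distance_km <= max_distance_km and r.probability > min_probability
    }


# Hypothetical records, invented for illustration only.
RECORDS = [
    LineOfSight("A", "B", 4.2, 0.93),
    LineOfSight("B", "A", 4.2, 0.88),
    LineOfSight("A", "C", 18.5, 0.55),
    LineOfSight("C", "D", 27.0, 0.71),
    LineOfSight("D", "A", 45.0, 0.52),
    LineOfSight("B", "C", 12.0, 0.31),
]

# Number of arcs retained for each combination of thresholds.
arc_counts = {
    (max_km, min_p): len(build_arcs(RECORDS, max_km, min_p))
    for max_km in (20, 50)
    for min_p in (0.25, 0.5, 0.75)
}

for (max_km, min_p), count in sorted(arc_counts.items()):
    print(f"up to {max_km} km, probability > {min_p:.0%}: {count} arcs")
```

Each thresholded arc set would then be passed to the network measures or ERGM estimation of interest, so that conclusions can be compared across thresholds rather than resting on a single choice.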
The case study showed ERGM is a promising method for identifying a range of processes that could give rise to the observed network. However, it currently cannot include one crucial factor: topography. ERGM happens in an abstract topological space, not in geographical space. This means that ERGM does not allow one to evaluate how probable it is for an observed network to emerge by chance in a particular landscape, i.e. to what extent it is a product of the landscape rather than of social processes. Since ERGM was not developed to answer such a question, we argue that currently its main benefit lies in taking a network out of its geographical context and explicitly exploring one's assumptions about social processes. Including a geographical parameter in such an already very complex model would make it even more complex and less easy to identify which effects give rise to the patterns of interest. Moreover, an evaluation of the impact of the landscape will be more computationally intensive, and future work along these lines would therefore very much benefit from the focus offered by an ERGM that has narrowed down the possible range of processes to those which are possible and significant in abstract space. However, we believe that the ability to incorporate topography as a parameter in ERGMs would be a valuable addition to an already useful method, one that cannot be performed at this stage and will need to be developed in close collaboration with statisticians.

A number of other methodological issues will need to be addressed in future work. Firstly, multiple viewer points per site could be considered since this significantly affects the structure of the visibility network. However, this can only be done in research contexts where one has reliable information about the exact area of occupation. We believe a valuable case study would be to estimate an ERGM for a much smaller area focussing on one oppidum and surrounding rural settlements which are particularly well studied.
Secondly, another valuable experiment would be to evaluate the use of ERGM for visibility networks by comparing the process suggested by an ERGM with an observed process of settlements established in a known order, within a research area where such detailed information is available.

Acknowledgements

The ‘Urban Connectivity in Iron Age and Roman Southern Spain’ project directed by Prof. Simon Keay and Dr. Graeme Earl was funded by the UK Arts and Humanities Research Council (AHRC) between 2002 and 2005 with subsequent support by the University of Southampton and institutions in Seville. We would like to thank Cat Cooper for help with the maps, Leticia López Mondéjar for bibliographical suggestions, Iza Romanowska for advice and reading an early draft, and Dave Wheatley for his ArcGIS Python script for probable viewsheds.

Appendix A. Supplementary data

Supplementary data related to this article can be found at http://dx.doi.org/10.1016/j.jas.2014.05.027.

References

Anderson, C.J., Wasserman, S., Crouch, B., 1999. A p* primer: logit models for social networks. Social Netw. 21, 37–66.
Brughmans, T., 2010. Connecting the dots: towards archaeological network analysis. Oxf. J. Archaeol. 29, 277–303.
Brughmans, T., Keay, S., Earl, G.P., 2014. Understanding inter-settlement visibility in Iron Age and Roman Southern Spain with exponential random graph models for visibility networks. J. Archaeol. Method Theory. (in press).
Conolly, J., Lake, M., 2006. Geographical Information Systems in Archaeology. Cambridge University Press, Cambridge – New York.
Davidson, D.A., 1979. The Orcadian environment and cairn location. In: Renfrew, C. (Ed.), Investigations in Orkney. Thames and Hudson, London, pp. 7–20.
De Montis, A., Caschili, S., 2012. Nuraghes and landscape planning: coupling viewshed with complex network analysis. Landsc. Urban Plan. 105, 315–324.
Erdős, P., Rényi, A., 1959. On random graphs. Publ. Math. 6, 290–297.
Escacena, J.L., Belén, M., 1998. Pre-Roman Turdetania. In: Keay, S. (Ed.), The Archaeology of Early Roman Baetica. Journal of Roman Archaeology Supplement Series 29. J. Roman Archaeol. Portsm. Rhode Isl., 23–37.
Fisher, P.F., 1992. First experiments in viewshed uncertainty: simulating fuzzy viewsheds. Photogramm. Eng. Rem. Sens. 58, 345–352.
Fisher, P.F., 1995. An exploration of probable viewsheds in landscape planning. Environ. Plan. B: Plan. Des. 22, 527–546.
Fraser, D., 1983. Land and society in Neolithic Orkney. BAR British Series 117. Archaeopress, Oxford.
Garrido González, P., 2011. La Ocupación Romana del Valle del Guadiamar y la conexión Minera (Unpublished PhD thesis). Universidad de Sevilla, Sevilla, ISBN 9788469478226. URL. http://fondosdigitales.us.es/tesis/tesis/1564/laocupacion-romana-del-valle-del-guadiamar-y-la-conexion-minera/ (accessed 31.03.14.).
Golitko, M., Meierhoff, J., Feinman, G.M., Williams, P.R., 2012. Complexities of collapse: the evidence of Maya obsidian as revealed by social network graphical analysis. Antiquity 86, 507–523.
Grau Mira, I., 2003. Settlement dynamics and social organization in eastern Iberia during the Iron Age (eighth-second centuries BC). Oxf. J. Archaeol. 22, 261–279. http://dx.doi.org/10.1111/1468-0092.00187.
Grau Mira, I., 2004. La construcción del paisaje ibérico: aproximación SIG al territorio protohistórico de la Marina Alta. SAGVNTVM (P.L.A.V.) 36, 61–75.
Grau Mira, I., 2005. Romanization in Eastern Spain: a GIS approach to Late Iberian Iron Age landscape. In: Berger, J.-F., Bertoncello, F., Braemer, F., Gourguen, D., Gazenbeek, M. (Eds.), Temps et Espaces de L'homme En Société, Analyses et Modèles Spatiaux En Archéologie. XXVième Rencontres Internationales D'archéologie et D'histoire d'Antibes. Éditions APDCA, Antibes, pp. 325–334.
Handcock, M.S., Hunter, D.R., Butts, C.T., Goodreau, S.M., Morris, M., 2003. statnet: Software Tools for the Statistical Modeling of Network Data [WWW Document]. URL. http://statnetproject.org (accessed 31.03.14.).
Harrigan, N., 2007. PNet for Dummies: an Introduction to Estimating Exponential Random Graph (p*) Models with PNet (accessed 31.03.14.).
Holland, P.W., Leinhardt, S., 1981. An exponential family of probability distributions for directed graphs. J. Am. Stat. Assoc. 76, 33–65.
Koskinen, J., Snijders, T.A.B., 2013. Simulation, estimation, and goodness of fit. In: Lusher, D., Koskinen, J., Robins, G. (Eds.), Exponential Random Graph Models for Social Networks. Cambridge University Press, Cambridge, pp. 141–166.
Lazega, E., van Duijn, M.A.J., 1997. Position in formal structure, personal characteristics and choices of advisors in a law firm: a logistic regression model for dyadic network data. Social Netw. 19, 375–397.
Llobera, M., 2003. Extending GIS-based visual analysis: the concept of visual scapes. Int. J. Geogr. Inf. Sci. 17, 25–48. http://dx.doi.org/10.1080/13658810210157732.
Llobera, M., 2007. Reconstructing visual landscapes. World Archaeol. 39, 51–69.
Lusher, D., Koskinen, J., Robins, G., 2013. Exponential Random Graph Models for Social Networks. Cambridge University Press, Cambridge.
Milo, R., Shen-Orr, S., Itzkovitz, S., Kashtan, N., Chklovskii, D., Alon, U., 2002. Network motifs: simple building blocks of complex networks. Science 298, 824–827.
Moreno, J.L., Jennings, H.H., 1938. Statistics of social configurations. Sociometry 1, 342–374.
Peeples, M.A., Roberts, J.M., 2013. To binarize or not to binarize: relational data and the construction of archaeological networks. J. Archaeol. Sci. 40, 3001–3010. http://dx.doi.org/10.1016/j.jas.2013.03.014.
PNET: http://sna.unimelb.edu.au/PNet (accessed 30.07.13.).
Robins, G., 2011. Exponential random graph models for social networks. In: Scott, J., Carrington, P.J. (Eds.), The SAGE Handbook of Social Network Analysis. Sage, London, pp. 484–500.
Robins, G., Pattison, P., Kalish, Y., Lusher, D., 2007a. An introduction to exponential random graph (p*) models for social networks. Social Netw. 29, 173–191.
Robins, G., Snijders, T., Wang, P., Handcock, M., Pattison, P., 2007b. Recent developments in exponential random graph (p*) models for social networks. Social Netw. 29, 192–215.
Robins, G., Pattison, P., Wang, P., 2009. Closure, connectivity and degree distributions: exponential random graph (p*) models for directed social networks. Social Netw. 31, 105–117.
Ruestes Bitrià, C., 2008. A multi-technique GIS visibility analysis for studying visual control of an Iron Age landscape. Internet Archaeol. 23. http://intarch.ac.uk/journal/issue23/4/index.html (accessed 30.07.13.).
Ruiz Rodríguez, A., 1997. The Iron Age Iberian peoples of the upper Guadalquivir valley. In: Díaz-Andreu, M., Keay, S. (Eds.), The Archaeology of Iberia. The Dynamics of Change. Routledge, London – New York, pp. 175–191.
Ruiz, A., Molinos, M., 1993. Iberos. Análisis arqueológico de un proceso histórico. Crítica, Barcelona.
Shemming, J., Briggs, K., 2014. Anglo-saxon Communication Networks [WWW Document]. URL. http://keithbriggs.info/AS_networks.html (accessed 31.03.14.).
StOCNET: http://www.gmw.rug.nl/~stocnet/StOCNET.htm (accessed 28.03.14.).
Swanson, S., 2003. Documenting prehistoric communication networks: a case study in the Paquimé polity. Am. Antiq. 68, 753–767.
Tilley, C.Y., 1994. A Phenomenology of Landscape. Berg, Oxford.
UCINET: https://sites.google.com/site/ucinetsoftware/home (accessed 30.07.13.).
Wang, P., Robins, G., Pattison, P., 2009. PNet. Program for the Simulation and Estimation of Exponential Random Graph (p*) Models. User Manual.
Wasserman, S., Pattison, P., 1996. Logit models and logistic regressions for social networks: I. An introduction to Markov graphs and p*. Psychometrika 61, 401–425.

454

T. Brughmans et al. / Journal of Archaeological Science 49 (2014) 442e454

Wheatley, D., Gillings, M., 2000. Vision, perception and GIS: developing enriched approaches to the study of archaeological visibility. In: Lock, G.R. (Ed.), Beyond the Map: Archaeology and Spatial Technologies. IOS Press, Amsterdam, pp. 1e27.

Wheatly, D., Gillings, M., 2002. Spatial Technology and Archaeology. The Archaeological Applications of GIS. Taylor & Francis, London e New York.