Biological Journal of the Linnean Society (1997), 60: 73–93. With 6 figures
A simulation study of microevolutionary inferences by spatial autocorrelation analysis

ROBERT R. SOKAL FMLS
Department of Ecology & Evolution, State University of New York, Stony Brook, NY 11794-5245, U.S.A.

NEAL L. ODEN
The EMMES Corporation, 11325 Seven Locks Road, Suite 214, Potomac, MD 20854, U.S.A.

AND BARBARA A. THOMSON
Department of Ecology & Evolution, State University of New York, Stony Brook, NY 11794-5245, U.S.A.
Received 2 August 1995, accepted for publication 16 February 1996
To explore the extent to which microevolutionary inference can be made using spatial autocorrelation analysis of gene frequency surfaces, we simulated sets of surfaces for nine evolutionary scenarios, and subjected spatially-based summary statistics of these to linear discriminant analysis. Scenarios varied the amounts of dispersion, selection, migration, and deme sizes, and included: panmixia, drift, intrusion, and stepping-stone models with 0-2 migrations, 0-2 selection gradients, and migration plus selection. To discover how weak evolutionary forces could be and still allow discrimination, each scenario had both a strong and a weak configuration. Discriminant rules were calculated using one collection of data (the training set) consisting of 250 sets of 15 surfaces for each of the nine scenarios. Misclassification rates were verified against a second, entirely new set of data (the test set) equal in size. Test set misclassification rates for the 20 best discriminating variables ranged from 39.3% (weak) to 3.6% (strong), far lower than the expected rate of 88.9% absent any discriminating ability. Misclassification was highest when discriminating the number of migrational events or the presence or number of selection events. Discrimination of drift and panmixia from the other scenarios was perfect. A subsequent subjective analysis of a subset of the data by one of us yielded comparable, although somewhat higher, misclassification rates. Judging by these results, spatial autocorrelation variables describing sets of gene frequency surfaces permit some microevolutionary inferences. ©1997 The Linnean Society of London
ADDITIONAL KEY WORDS: spatial autocorrelation – microevolution – simulation – migration – selection – drift.
Correspondence to R. R. Sokal. Email:
[email protected]
CONTENTS

Introduction
Material and methods
    The surface-generating program
    Components of the surface-generating program
    Design of the simulation experiments
    Analysis of results
Results
Discussion and conclusions
    Sources of variation in spatial autocorrelation
    Weak and strong parameters
    Practical considerations
    Subjective errors investigated
Acknowledgements
References
INTRODUCTION
Our laboratory developed spatial autocorrelation analysis as an exploratory tool for making microevolutionary inferences from surfaces of gene frequencies and morphological variables (Sokal & Oden, 1978a, b; Sokal, 1979; Sokal & Wartenberg, 1981; Sokal, 1984, 1986). In papers applying these methods to numerous datasets from animals, plants, and humans, we inferred selection, migration, drift and isolation by distance (see Sokal, Smouse & Neel, 1986; Sokal et al., 1987a, 1992; Sokal, Oden & Barker, 1987b; Sokal, Harding & Oden, 1989a; Barbujani & Sokal, 1991; Falsetti & Sokal, 1993; and references cited therein). Spatial autocorrelation analysis is based on spatial autocorrelation statistics, customarily expressed as one-dimensional and directional (two-dimensional) correlograms. However, we include under this term the entire panoply of procedures used in our laboratory, including tests for heterogeneity among localities, clustering of correlograms and variable surfaces, and consideration of other spatial statistics, such as FST. Sokal & Wartenberg (1983) and Sokal et al. (1989b) studied the effects of microevolutionary factors on spatial autocorrelation statistics. They found that: (1) stochastic evolutionary simulations result in gene-frequency surfaces with spatial characteristics that are functions of simulation parameters, such as parent vagility and neighborhood size; (2) separate realizations of processes with identical parameters yield similar spatial correlograms; (3) spatial autocorrelation is, in general, independent of the mean of a surface; (4) migration affects surfaces and correlograms when immigrant gene-frequency differentials are substantial; and (5) selection gradients yield monotonically decreasing correlograms. Slatkin & Arter (1991) ran a series of simulations to test the efficacy of spatial autocorrelation in making microevolutionary inferences.
Using only one-dimensional correlograms, they concluded that the stochastic variation of the spatial process was too great to permit inferences. Sokal & Oden (1991) briefly responded to the Slatkin & Arter (1991) criticism, and in the same year, Sokal & Jacquez (1991) published the first examination of the inferential power of spatial autocorrelation analysis. These authors created six simulated datasets, each comprising 12 to 15 gene frequencies, representing six different evolutionary scenarios. The true scenarios were kept from one of the authors, who then applied standard autocorrelation analysis to each dataset. Five of the six evolutionary scenarios were correctly
interpreted; isolation by distance was incorrectly diagnosed. These results represent a small study and may not be convincing. In this paper we report the results of a far larger study to examine the inferential power of spatial autocorrelation analysis. We show that spatial statistics carry information discriminating among various evolutionary scenarios by computing linear discriminant functions able to distinguish the different scenarios not only of a training dataset, but also of a second, replicated test set. We examine the relative ease with which different scenarios can be distinguished and the strength of various parameters necessary to make successful discrimination possible. We also carry out subjective analyses of the spatial autocorrelation output for 25 unknown datasets and compare the success rate of this procedure with that of the discriminant function. We conclude with a discussion of strengths and limitations of spatial autocorrelation analysis.
MATERIAL AND METHODS
The surface-generating program

We generated the datasets with a FORTRAN program written by NLO that simulates a diploid population located on a 21 × 21 lattice of stepping stones. This program closely resembles the simulation program employed by Slatkin & Arter (1991). Each stepping stone is populated with n (= 64 or 448) diploid individuals. (The model could equally well be considered as consisting of haploid individuals, but we follow Slatkin & Arter in interpreting it as diploid.) The surface is initially settled with a starting gene frequency q0 obtained as described below. At each iteration the new individuals in a deme are randomly selected from an infinite population whose gene frequency is an average of the gene frequency of the same deme and its four orthogonal neighbors in the previous iteration. Interdemic dispersal is simulated by adjusting the weights in the average. The proportion of overall interdeme dispersal between the deme being considered and its adjacent neighbors is called m and is set either to 0 (for scenarios 1 and 2 below) or to 0.1172 (for scenarios 3 to 9). The latter process, equivalent to isolation by distance in a continuous spatial model, creates the spatial autocorrelation in the surfaces.

Components of the surface-generating program

Initial and immigrant gene frequencies

To make our simulation as realistic as possible, we used actual human gene frequencies from Italy, Japan and Nigeria assembled from Roychoudhury & Nei (1988). To decide on a starting gene frequency, we first randomly selected one of 54 allele frequencies from 21 blood group systems (with replacement and constrained to range between 0.05 and 0.95) and then randomly chose one of the three countries. When an immigrant gene frequency was also required, we randomly chose one of the remaining two countries for the given allele frequency. When a second immigrant gene frequency was needed, we took the third country for that allele frequency.
We excluded all cases that yielded an immigrant gene-frequency differential greater than 0.7, since such large differences are too easy to recognize.
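The selection procedure just described can be sketched as follows. This is a hypothetical Python illustration, not the authors' program; the allele-frequency table below is invented for the example and is not taken from Roychoudhury & Nei (1988):

```python
import random

# Hypothetical stand-in for the 54 allele frequencies from 21 blood group
# systems tabulated for the three countries; values invented for illustration.
FREQS = {
    "ABO*A": {"Italy": 0.26, "Japan": 0.28, "Nigeria": 0.15},
    "Rh*d":  {"Italy": 0.40, "Japan": 0.05, "Nigeria": 0.22},
    "MN*M":  {"Italy": 0.54, "Japan": 0.56, "Nigeria": 0.48},
}

def choose_frequencies(rng):
    """Draw a starting frequency and up to two immigrant frequencies.

    Mirrors the stated procedure: pick an allele (with replacement,
    constrained to 0.05-0.95), pick a country for the starting frequency,
    then use the remaining countries for the immigrant frequencies.
    Cases with an immigrant differential > 0.7 are rejected and redrawn.
    """
    while True:
        allele = rng.choice(sorted(FREQS))
        by_country = FREQS[allele]
        # constrain usable allele frequencies to the 0.05-0.95 range
        if not all(0.05 <= q <= 0.95 for q in by_country.values()):
            continue
        countries = sorted(by_country)
        rng.shuffle(countries)
        q0 = by_country[countries[0]]       # starting gene frequency
        q_imm1 = by_country[countries[1]]   # first immigrant frequency
        q_imm2 = by_country[countries[2]]   # second immigrant frequency
        # exclude differentials that would be too easy to recognize
        if abs(q_imm1 - q0) > 0.7 or abs(q_imm2 - q0) > 0.7:
            continue
        return q0, q_imm1, q_imm2

rng = random.Random(1996)
q0, q_imm1, q_imm2 = choose_frequencies(rng)
```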
Figure 1. Total distribution of available absolute gene-frequency differentials (n = 149 differentials, after excluding those > 0.7).
Figure 1 illustrates the total distribution of available gene-frequency differentials, sign neglected. Having chosen an initial gene frequency, the program populated the lattice with 2n genes per stepping stone. The number of mutant alleles in each stepping stone, a function of the starting gene frequency, q0, was randomly drawn from the binomial distribution (p0 + q0)^2n. Subsequent steps (which varied with the scenario) were dispersal, migration, selection, and reproduction, explained below.

Dispersal

In each generation dispersal takes place between all orthogonally-connected lattice neighbours. The formula for updating each stepping stone not on the lattice edge is:

q' = 0.8828q + 0.1172(t·qt + b·qb + l·ql + r·qr)

where q' is the new gene frequency of the stepping stone, q is its previous gene frequency, t, b, l, and r are dispersal weights for the top, bottom, left and right neighbors (normalized so that their sum equals 1), and qt, qb, ql and qr are the corresponding gene frequencies. Suitable adjustments were made for stepping stones along the edge of the lattice. These and other formulas are given in terms of gene frequencies for simplicity, but the actual bookkeeping in the computer program was in terms of numbers of mutant alleles. For scenario 3, the simple stepping-stone model (see below), the four dispersal weights were all set to 1 before normalization.

Migration

We simulated migration from an external source by fixing the top five rows of the lattice at a given immigrant gene frequency before executing the dispersal step. This frequency was chosen in the manner already described. These rows did not enter into the final datasets, which were drawn only from rows and columns 8–14. By adjusting the dispersal weights (see Table 1), the genes were preferentially drawn
from the top tier, resulting in a diffusive spread of the immigrant genes from the top down.

Selection

After the dispersal step, the gene frequency was subjected to a selection step. A separate selection coefficient was computed for each lattice row. Selection against the mutant allele was strongest along the bottom edge (row 21), where it was set at sUL, the upper limit of the selection coefficient. The value of sUL was determined for each surface by randomly sampling from a truncated exponential distribution ranging between 0 and smax, as given in Table 1. The density is:

f(sUL) = (λ/a)e^(−λ·sUL), for 0 ≤ sUL ≤ smax, with λ = 83 and a = 1 − e^(−λ·smax).

A sample distribution of sUL values for smax = 0.05 is shown in Figure 2A. For each surface the selection coefficients form a gradient that varies linearly from sUL (bottom row) to 0 (top row). Once the selection coefficients for the entire surface had been determined, the gene frequency of each stepping stone was updated as follows:

q″ = q′(1 − s)/(1 − q′s)

where q″ is the updated gene frequency, q′ is the gene frequency before
TABLE 1. The nine microevolutionary scenarios and their process parameters

                                        Weak parameters                     Strong parameters
                                  No.        Dispersal               No.        Dispersal
Scenario                          gen.   n   weights t,b,l,r  smax   gen.   n   weights t,b,l,r  smax
1. Panmixia                         1    64  0, 0, 0, 0       0        1   448  0, 0, 0, 0       0
2. Drift                           10    64  0, 0, 0, 0       0       49   448  0, 0, 0, 0       0
3. Stepping stone                  49    64  1, 1, 1, 1       0       49   448  1, 1, 1, 1       0
4. Stepping stone + migration      49    64  6, 1, 1, 1       0       49   448  18, 1, 1, 1      0
5. Stepping stone + 2 migrations   98    64  6, 1, 1, 1       0       98   448  18, 1, 1, 1      0
6. Stepping stone + selection      49    64  1, 1, 1, 1       0.05    49   448  1, 1, 1, 1       0.20
7. Stepping stone + 2 selections   49    64  1, 1, 1, 1       0.05    49   448  1, 1, 1, 1       0.20
8. Stepping stone + migration
   + selection                     49    64  6, 1, 1, 1       0.05    49   448  18, 1, 1, 1      0.20
9. Intrusion                       49    64  1, 1, 1, 1       0       49   448  1, 1, 1, 1       0

NOTES. No. gen. is the number of generations the simulation was run. n is the number of diploid individuals per stepping stone. Dispersal weights are t, b, l, r for top, bottom, left and right, respectively. smax is the maximum selection coefficient.
updating, and s is the selection coefficient specific to this stepping stone. This selection is equivalent to gametic selection.

Reproduction

The gene frequency of the next generation in each stepping stone is determined by randomly drawing from a normal distribution with mean q″ and variance p″q″/2n. Although the actual distribution should be binomial, sample sizes are large enough in this study to make the normal approximation a close fit.

Figure 2. A, distribution of sUL, the upper limit of the exponentially distributed selection coefficients, for the weak parameters. B, distribution of sUL, the upper limit of the stronger, uniformly distributed selection coefficients, for the weak parameters.

Design of the simulation experiments

The simulation experiments for all nine scenarios detailed below were carried out
under two sets of conditions, weak and strong parameters. The strong parameters used population sample sizes, preferential dispersal weights, and selection coefficients that were large enough to produce (on the average) marked differences from the basic pattern of the simple stepping-stone scenario. There was, of course, substantial variation from surface to surface due to stochastic variation among the realizations. The weak parameters decreased effective population size to 1/7th that of the strong parameters, and lowered both preferential dispersal weights and maximal selection coefficients as indicated in Table 1. These weaker parameters were introduced to test the sensitivity of the discriminant function near its lower limit. When microevolutionary forces are weak enough, they cannot be detected by any method. The question was: how weak can these forces be and still yield interpretable results? Unless otherwise stated, all simulations were run for 49 generations. In previous studies we had found that this number of generations led to a quasi-equilibrium of gene frequencies for most combinations of parameter values. The processes described were combined in various arrangements to design nine microevolutionary scenarios. These scenarios are listed in Table 1, together with the parameters determining them. They are the following:

1. Panmixia. These surfaces were simulated for one generation only, as the sample size for all stepping stones was large enough that Hardy-Weinberg equilibrium kept the population at a fixed gene frequency each generation. To generate a surface for panmixia, we settled a lattice with an initial gene frequency and terminated the program.

2. Drift. To simulate surfaces subject to genetic drift but no other microevolutionary forces, we set smax to 0 and prevented dispersal by setting all dispersal weights to 0. Thus the only force perturbing gene frequencies in these data was the sampling error during the reproduction step of the simulation, leading to genetic drift. For the weak parameter simulations, we terminated the process after 10 generations, since otherwise the gene frequencies would have drifted too close to fixation, making the detection of this scenario too simple for the discriminant function or a subjective observer. For the strong parameter simulations, the sample size was large enough that the effect of drift was substantially diminished. In these simulations, we proceeded to the usual 49 generations.

3. Stepping stone. This is not the generalized stepping-stone model of the population genetics literature, which requires that there be some gene flow from all stepping stones to a given stepping stone. Our model is the classical one as defined by Kimura (1953), in which dispersal is only between orthogonally-neighboring stepping stones. Longer distance gene flow is the cumulative result of these single steps over generations. To realize this model, we set all dispersal weights equally to 1 and smax to 0. After running for 49 generations, the resulting gene frequencies closely resemble those produced by isolation by distance in a continuous model.

4. Stepping stone plus migration. In this model immigrant gene frequencies replaced native ones in the top five rows of the lattice for each of 49 generations, and a preferential weight for dispersal from the top (see Table 1 for the weights) diffused immigrant genes from north to south.

5. Stepping stone plus two migrations. After the first 49 generations of running the
program as for scenario 4 above, the surfaces were rotated 90° counter-clockwise, resulting in west-to-east diffusion gradients. These surfaces were then furnished as input to a new sequence of 49 generations of migration, with immigrant genes again diffusing from the top (north). The exact shape of the resulting gene-frequency surface cannot be easily predicted because it depends on both the sign and the magnitude of the immigrant gene-frequency differentials of the two migration processes.

6. Stepping stone plus selection. Here the dispersal parameters were equally weighted in each direction and maximal selection operated against the mutant in the bottom row. Since the truncated exponential distribution from which the sUL values are sampled will yield quite low selection coefficients most of the time (see Figure 2A) and substantial coefficients near smax only rarely, it would be difficult, if not impossible, to discriminate stepping-stone-plus-selection from the plain stepping-stone scenario. Therefore, we included one surface with a higher expected value of sUL among the 15 surfaces of each dataset (described below). We obtained the surface by sampling a selection coefficient at random from a uniform distribution between 0.03 and 0.05 for smax = 0.05 (weak parameter) or from 0.15 to 0.20 for smax = 0.20 (strong parameter). A sample frequency distribution of these coefficients is shown in Figure 2B.

7. Stepping stone plus two selections. We simulated the effect of two simultaneous, perpendicular gradients applied to different alleles. The simulation proceeded as in scenario 6 above, except that two surfaces were treated with the higher sUL values from the uniform distribution. After the 49th generation, 7 of the 15 surfaces of each dataset (explained below), including one with intentionally higher sUL, were rotated 90° counter-clockwise so that approximately half of the selection gradients coursed right to left (east to west) and the other half bottom to top (south to north). This avoided two coincident selection gradients, which, even when they act on different loci, would not be distinguishable from a single selection gradient operating on two loci.

8. Stepping stone plus migration plus selection. Both migration and selection were turned on simultaneously for 49 generations. However, to separate the directions of the migrational and selectional trends, selection was strongest along the rightmost column, yielding an east-to-west decreasing gradient. If these trends were coincident, selection could not readily be distinguished from migration.

9. Intrusion. This scenario was intended to imitate the long-distance, one-time migration of an immigrant population into an area originally settled by natives who are displaced. It differs from the migration examples discussed earlier in that the only diffusion into the native population takes place along its new boundary through dispersal effects. We divided the 21 × 21 lattice into three areas by setting up a Y-shaped boundary (as shown in Fig. 3). We then replaced the initial population within the V-shaped portion of the Y by immigrants. The subsequent 49 generations featured only dispersal and reproduction. The dispersal resulted in a fuzzing of an initially sharp boundary between the immigrants and the natives. (This assumes, of course, that there was a sharp boundary initially. This would happen only if the gene-frequency differential between immigrants and natives was appreciable.)
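Taken together, the dispersal, selection and reproduction steps described above amount to a simple per-generation update. The following is a minimal Python paraphrase of the stated formulas, not the authors' FORTRAN program; the edge handling (renormalizing weights over the neighbors that exist) and the clamping of sampled frequencies to [0, 1] are our assumptions:

```python
import random

M = 0.1172  # total interdeme dispersal fraction (scenarios 3-9)

def dispersal(q, weights=(1.0, 1.0, 1.0, 1.0)):
    """One dispersal step: q' = (1 - m)q + m * (weighted neighbor average).

    q is a list of lists (the lattice); weights are (top, bottom, left,
    right) and are renormalized over whichever neighbors exist, which is
    our assumed stand-in for the paper's 'suitable adjustments' at edges.
    """
    rows, cols = len(q), len(q[0])
    t, b, l, r = weights
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            nbrs = []
            if i > 0:        nbrs.append((t, q[i - 1][j]))
            if i < rows - 1: nbrs.append((b, q[i + 1][j]))
            if j > 0:        nbrs.append((l, q[i][j - 1]))
            if j < cols - 1: nbrs.append((r, q[i][j + 1]))
            wsum = sum(w for w, _ in nbrs)
            avg = sum(w * qq for w, qq in nbrs) / wsum
            out[i][j] = (1.0 - M) * q[i][j] + M * avg
    return out

def select(q_prime, s):
    """Gametic selection against the mutant: q'' = q'(1 - s)/(1 - q's)."""
    return q_prime * (1.0 - s) / (1.0 - q_prime * s)

def reproduce(q2, n, rng):
    """Normal approximation to binomial sampling of 2n genes per deme."""
    var = q2 * (1.0 - q2) / (2 * n)
    return min(1.0, max(0.0, rng.gauss(q2, var ** 0.5)))
```

With equal weights, a uniform surface is invariant under dispersal (every weighted neighbor average equals the cell's own value), which provides a quick sanity check of the update.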
Figure 3. The full 21 × 21 lattice, on which simulations were carried out, divided into three regions. The inner 7 × 7 lattice that was actually analysed is also divided into three regions of approximately equal numbers of lattice points.
Assembling the datasets

For each of the nine scenarios, we assembled a master population of 1000 surfaces, generated with the same parameters but with different random seeds. From these master populations we sampled, with replacement, to obtain the 15 surfaces that constitute a single dataset. We intended to have each dataset represent typical real-world gene-frequency data. The 15 surfaces simulate 15 different biallelic loci. All scenarios involving selection also had master populations of 500 surfaces with the forced, stronger selection coefficient. These were used as sampling universes for the single, stronger selection surface representing each selection trend.

Number of replicates, training and test sets

We sampled 250 replicated datasets for each scenario and for both weak and strong parameters. Thus, altogether we generated 4500 datasets (250 × 9 × 2), known as the training set, used to develop the discriminant functions discussed later in this account. Because discriminant functions tend to fit themselves to the idiosyncrasies of a particular dataset, we tested the efficacy of the discriminant function on new datasets called the test sets. Construction of this test set gave us a total of 9000 datasets of 441 (= 21 × 21) populations (stepping stones) and 15 gene frequencies each.

Analysis of results

For each of the 9000 datasets, we calculated a series of population-genetic and spatial statistics. We considered only the central 7 × 7 lattice to avoid the complications of edge effects in the larger 21 × 21 lattice. Our computations were therefore carried out on 49 localities (stepping stones) for each allele-frequency
surface. For each dataset, we computed 109 statistics divided into two main groups. Seventy statistics were routine summary statistics, while the other 39 were specifically designed to simulate the thought processes of the analyst studying such data. Twenty-six of the statistics derived from one-dimensional spatial autocorrelation analysis (Sokal & Oden, 1978a, b) and 44 from two-dimensional spatial autocorrelation analysis (Oden & Sokal, 1986), correlation of surface gene frequencies, and other statistics. They included statistics such as the averages of the 15 values of the spatial autocorrelation coefficient I (one average for each of the 10 one-dimensional or 21 two-dimensional distance classes of the study), the standard deviations of these 15 I values (for each of the distance classes), the average of the I values over all distance classes and correlograms, and the average intercept and monotonicity of the 15 correlograms. The remaining 39 variables were developed in an attempt to imitate the somewhat subjective procedures developed in our laboratory for interpreting the population-biological processes underlying spatial phenomena (Sokal, 1984, 1986). We asked 12 questions about the data.

(1) Is there spatial structure?
(2) Is there heterogeneity among locality samples?
(3) Are the one-dimensional correlograms homogeneous?
(4) What is the shape of each group of one-dimensional correlograms?
(5) How many groups of one-dimensional correlograms are there?
(6) Are the two-dimensional correlograms homogeneous?
(7) What is the shape of each group of two-dimensional correlograms?
(8) How many groups of two-dimensional correlograms are there?
(9) Are there any high absolute correlations among allele-frequency surfaces?
(10) Are the absolute correlations among surfaces homogeneous?
(11) How many groups of surfaces are there?
(12) How well do the clusters determined by the correlograms match the clusters determined by the surfaces?
These 39 variables were, in part, attempts to quantify visual impressions of correlograms and cluster analyses. Among the 109 variables employed, there is only one that is not strictly related to our established procedures of spatial autocorrelation analysis. This is AVGFST, the total FST (Nei, 1987) for the two alleles averaged over the 15 surfaces. All other variables are either summaries of spatial autocorrelation statistics or measures taken to respond to the 12 questions posed above. We do not list each variable in detail, but furnish below (Table 5) a list of the 20 variables that we found to be of value in discriminating among the nine microevolutionary scenarios. We subjected the nine groups of 250 datasets to linear discriminant analysis, using the SAS procedure DISCRIM (SAS Institute, 1989). We also ran the stepwise procedure STEPDISC (SAS Institute, 1989) to obtain the best discriminant functions for 5, 20 and 50 variables.
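For intuition about the raw ingredient of many of these variables, the spatial autocorrelation coefficient I (Moran's I) for the innermost distance class of a lattice surface can be computed as follows. This sketch is ours (rook connectivity with binary weights), not the authors' implementation:

```python
def morans_i(surface):
    """Moran's I for the first distance class on a lattice.

    Uses rook (orthogonal-neighbor) connectivity with weights of 1:
        I = (n / W) * sum_ij w_ij (x_i - xbar)(x_j - xbar) / sum_i (x_i - xbar)^2
    where W is the sum of all weights over directed neighbor pairs.
    """
    rows, cols = len(surface), len(surface[0])
    n = rows * cols
    mean = sum(sum(row) for row in surface) / n
    dev = [[v - mean for v in row] for row in surface]
    num = 0.0     # sum of d_i * d_j over directed neighbor pairs
    w_total = 0   # W: number of directed neighbor pairs
    for i in range(rows):
        for j in range(cols):
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ii, jj = i + di, j + dj
                if 0 <= ii < rows and 0 <= jj < cols:
                    num += dev[i][j] * dev[ii][jj]
                    w_total += 1
    denom = sum(d * d for row in dev for d in row)
    return (n / w_total) * num / denom

# A checkerboard is maximally negatively autocorrelated at distance class 1:
# every neighbor pair joins unlike values, and I comes out to exactly -1 here.
board = [[(i + j) % 2 for j in range(3)] for i in range(3)]
```

A smooth north-to-south gradient, by contrast, yields a positive I at the first distance class, which is the qualitative signature the correlogram-based variables summarize.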
RESULTS
Table 2 lists the overall misclassification rates (labelled raw rates) of the
TABLE 2. Overall misclassification rates (in percent) of the discriminant functions for different numbers of variables

                                     Raw rates                Adjusted rates*
                  No. variables  Training set  Test set  Training set  Test set
Weak parameters         5           39.87       44.18        7.42       11.33
                       20           32.67       39.29        3.87        7.11
                       50           26.44       38.49        3.02        7.51
                      109           23.07       41.87        2.71        8.84
Strong parameters       5           16.67       16.63        9.73       10.67
                       20            3.91        3.60        0.67        0.62
                       50            2.36        3.33        0.31        0.84
                      109            1.69        3.20        0.13        0.80

* Adjusted for five groups comprising scenarios 1, 2, (3, 6, 7), (4, 5, 8) and 9.
discriminant functions for different numbers of variables. As expected, the misclassification rate decreases as the number of variables increases. This is true for both weak and strong parameters. Furthermore, the error rates for the test sets are generally higher than those of the training sets, and the error rates for the weak parameters are considerably higher than for the strong parameters. Under the null hypothesis of random allocation of the datasets to the nine scenarios, we would obtain an overall error rate of 88.89%. All our error rates, even the worst ones for five variables and weak parameters, are substantially lower than that. The discriminant function produces a meaningful classification of the datasets based on some or all of the 109 spatial statistics. Tables 3 and 4 show the detailed classification results based on the 20-variable, stepwise discriminant functions applied to the datasets with weak and strong parameters, respectively. The discriminant function for the strong parameters is almost perfect; its percent misclassification error is only 3.60%. For the datasets based on weak parameters, the error rate is far higher, 39.29%, but this is still far below the expected misclassification rate under random allocation.

TABLE 3. Percent classification of the 9 microevolutionary scenarios for the weak parameter test set based on the 20-variable stepwise discriminant function

                                    Classified as
Actual
scenario     1       2      3      4      5      6      7      8      9
1         100.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00
2           0.00 100.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00
3           0.00   0.00  40.80   0.80   0.00  17.60  37.20   1.60   2.00
4           0.00   0.00   0.00  47.20   5.60   1.20   0.00  43.20   2.80
5           0.00   0.00   0.00  10.40  78.40   0.40   0.40   9.60   0.80
6           0.00   0.00  35.60   0.00   0.00  23.20  36.80   1.20   3.20
7           0.00   0.00  35.20  10.40   0.40  10.40  35.60   8.00   0.00
8           0.00   0.00   0.40  37.20  10.80   0.40   2.00  45.20   4.00
9           0.00   0.00   2.00   4.00   0.40   6.00   5.60   6.00  76.00

NOTE: The scenarios are identified in Table 1.

When we examine Table 3, we note that scenarios 1 and 2, panmixia and drift, are perfectly discriminated, even though these are the weak parameter datasets. In
TABLE 4. Percent classification of the 9 microevolutionary scenarios for the strong parameter test set based on the 20-variable stepwise discriminant function

                                    Classified as
Actual
scenario     1       2       3       4       5       6       7       8       9
1         100.00   0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
2           0.00 100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
3           0.00   0.00  100.00    0.00    0.00    0.00    0.00    0.00    0.00
4           0.00   0.00    0.00   98.00    0.00    1.60    0.00    0.00    0.40
5           0.00   0.00    0.00    2.40   95.20    0.00    0.00    2.40    0.00
6           0.00   0.00    0.00    0.00    0.00   99.20    0.40    0.00    0.00
7           0.00   0.00    0.00    0.00    0.00    3.60   96.40    0.00    0.00
8           0.00   0.00    0.00    2.00   15.60    0.00    0.00   82.40    0.00
9           0.00   0.00    0.00    0.00    0.00    3.60    0.00    0.00   96.40

NOTE: The scenarios are identified in Table 1.
fact, further inspection of the data showed that these two classes could be discriminated by single, invariant variables, AVGFST and I2D1. The former is the average FST-value for the 15 surfaces; the latter is the average autocorrelation of the innermost distance/direction class of the 2-dimensional correlogram. For the strong parameter sets, several variables (including these two) were needed to discriminate scenarios 1 and 2, but the two scenarios provide no challenge to the discriminant function. In the rest of the scenarios, for the weak parameter data, Table 3 shows that the error patterns are not evenly distributed. Scenarios 3, 6 and 7 (stepping stone, stepping stone + selection, and stepping stone + 2 selections) are often misclassified, one into the other, as are scenarios 4, 5 and 8. Selection under the weak parameter regime is too weak to be clearly detected, so the discriminant function program mistakes many of the selection datasets for the simple stepping-stone scenario. Scenarios 4, 5 and 8 are stepping stone + migration, stepping stone + 2 migrations, and stepping stone + migration + selection. We have already seen that selection is often not recognized, so the program considers stepping stone + migration + selection to be simply a stepping stone + migration dataset. The presence of migration is clearly distinguished from the simple stepping-stone model, but the different migration examples cannot easily be distinguished under the weak parameter regime. If we are satisfied with a classification that identifies stepping-stone models (with or without selection) and migration models (without distinguishing two migrations from a single migrational event), then the results are fairly satisfactory. This leaves scenario 9, the intrusion, which has the fourth highest percentage of correct classification for the weak parameters (Table 3) and is almost perfectly discriminated in the strong parameters (Table 4). In Table 2, the last two columns give adjusted rates.
These are the misclassification rates for five classes: 1, 2, (3, 6, 7), (4, 5, 8), 9. The adjusted misclassification rates are considerably lower than the raw rates. Even this rough grouping conveys useful information. In Table 4 there is still a residual indication of the tendency to confound scenarios 3, 6 and 7 and scenarios 4, 5, 8, but the overall classification of the strong parameter dataset is so good that the remaining errors really do not matter. We ran a variety of quadratic and nonparametric discriminant functions on the subset of our weak parameter data containing scenarios 3–8, in the hope of obtaining a better discriminant function, but we were not successful. It took the strengthening
SPATIAL AUTOCORRELATION ANALYSIS
85
of the parameters (increase in effective population size and increase in preferential dispersal from the north and in selection coefficients) to bring about a clear separation among the scenarios in each of the two clusters of scenarios. The variables that comprise the 20-variable discriminant function for both weak and strong parameters are shown in Table 5. There is substantial overlap; 11 of the 20 variables are common to the two studies. The 5-variable discriminant functions for both weak and strong parameters consist of variables ranked 1 through 5 in Table 5. The majority of the 20 variables come from the set of 70 straightforward, spatial statistical variables, rather than the more intricate variables devised to answer the questions posed during spatial autocorrelation analysis. Thus, plain spatial autocorrelation statistics seem sufficient to provide most of the information.

TABLE 5. Ranks of the top 20 variables chosen by stepwise discriminant analysis for both weak and strong parameters. The 39 descriptive variables for spatial autocorrelation analysis are distinguished from the 70 spatial statistics by the presence of an asterisk

Variable    Weak  Strong        Variable    Weak  Strong
I2D1          1      1          AVGFST*       —      2
AVGG*         2     18          MAXMD2D*      —      5
AVGVAR        3      3          I10           —      7
I2D16         4      —          SD10          —      8
KMD1*         5      —          SD1           —     10
AVGIBAR       6     20          AVGVAR2D      —     11
I2D3          7      6          KMD2*         —     12
I2D13         8      4          CORRSURF      —     13
GFPROB*       9     17          VARACOR*      —     16
AVGINT       10      —
I2           11      —
I2D20        12     15
I2D10        13      —
SD2D1        14      —
SD2D7        15     14
PANMIXN*     16      —
FPROB1D*     17     19
I1           18      9
I2D14        19      —
SD2D18       20      —

AVGFST    average of the total FST for the two alleles over the 15 surfaces.
AVGG      average G-statistic over the 15 allele-frequency surfaces.
AVGIBAR   average of the I-values over the 10 distance classes and 15 one-dimensional correlograms.
AVGINT    average X-intercepts of 15 one-dimensional correlograms.
AVGVAR    average of 15 variances of I-values calculated per one-dimensional correlogram.
AVGVAR2D  average of 15 variances of I-values calculated per 2-dimensional correlogram.
CORRSURF  average of 105 correlations between allele-frequency surfaces.
FPROB1D   combined probability (by Fisher’s method) of the 15 Bonferroni probabilities associated with the 15 one-dimensional correlograms.
GFPROB    combined probability (by Fisher’s method) of the 15 G-statistics for the allele-frequency surfaces.
Ii        average of 15 I-values for distance class i in the one-dimensional correlograms.
I2Di      average of 15 I-values for distance class i in the 2-dimensional correlograms.
KMD1      sum of squares resulting from the k-means clustering of one-dimensional correlograms.
KMD2      sum of squares resulting from the k-means clustering of 2-dimensional correlograms.
MAXMD2D   maximum pairwise Manhattan distance among the 15 2-dimensional correlograms.
PANMIXN   number of the cluster of one-dimensional correlograms exhibiting a characteristic panmixia shape.
SDi       standard deviation of 15 I-values for distance class i in the one-dimensional correlograms.
SD2Di     standard deviation of 15 I-values for distance class i in the 2-dimensional correlograms.
VARACOR   variance of the 105 absolute correlations between allele-frequency surfaces.
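Two of the tabulated variables, FPROB1D and GFPROB, combine 15 per-surface probabilities by Fisher's method. The computation can be sketched as follows (our own minimal implementation, not the authors' code; the closed-form survival function applies because the degrees of freedom, 2k, are always even):

```python
import math

def fisher_combined_p(pvalues):
    """Combine k independent p-values by Fisher's method.

    X = -2 * sum(ln p_i) follows a chi-square distribution with 2k
    degrees of freedom under the joint null. For even df = 2k the
    survival function has the closed form
    P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    k = len(pvalues)
    x = -2.0 * sum(math.log(p) for p in pvalues)
    half, term, total = x / 2.0, 1.0, 1.0
    for i in range(1, k):
        term *= half / i          # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total
```

For example, combining 15 moderately small per-correlogram probabilities yields a far smaller joint probability, which is what makes FPROB1D sensitive to weak but consistent structure across surfaces.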
86
ROBERT R. SOKAL ET AL
Figure 4. A, map of one surface from a sample test dataset representing a stepping-stone scenario. B, map of one surface from a sample test dataset representing a migrational scenario. Both samples were taken from the 25 test datasets used in the subjective analysis.
To compare our discriminant function results with those derived from subjective interpretation of spatial statistics, we randomly chose 25 datasets from the 2250 test datasets for the strong parameters. In an effort to eliminate any behind-the-scenes knowledge of the structure of these datasets, each was rotated at random 0°, 90°, 180°, or 270° and flipped randomly by rows, columns, both, or not at all. Thus, the subjective interpreter had no prior knowledge of the direction in which the trends (clines) ran. The following information was furnished for each dataset: averages of FST and the G-statistic, maps of the 15 surfaces (for two examples, see Fig. 4), one-dimensional correlograms for all 15 surfaces (for two examples, see Fig. 5), 15 two-dimensional correlograms (for two illustrations, see Fig. 6), clusterings of the surfaces, and clusterings of the one-dimensional and the two-dimensional correlograms. The datasets were prepared in consultation between NLO and BAT, while their true identities (the scenarios which they represent), how many scenarios were sampled, and how many replicates of each scenario were chosen, were carefully kept from RRS, who then examined each of these 25 datasets and classified each into one of the nine scenarios, based on his subjective interpretation of the analytical results. Table 6 shows the results of this study. RRS correctly identified the scenarios of 17 (68%) of the 25 datasets, a considerable improvement over the 11.1% that is expected on random allocation. Of the eight errors, two were correctly identified in alternative interpretations made before the true identities of the datasets were revealed. Five of the datasets wrongly identified by the subjective analyst were also wrongly allocated by the 5-variable discriminant function. Thus, these appear to be challenging datasets, although the 20-variable discriminant function correctly identified all 25 of the randomly chosen datasets.
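The training-set/test-set protocol used throughout this study generalizes: fit the discriminant rule on one collection of datasets, then verify misclassification rates on an entirely new collection. A minimal two-class sketch of Fisher's linear discriminant with held-out validation (the synthetic data and variable roles are illustrative only, not the study's SAS analysis):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two synthetic "scenarios", each summarized by 2 statistics
# (stand-ins for variables such as AVGFST or I2D1), 250 sets each.
train_a = rng.normal([0.0, 0.0], 0.5, size=(250, 2))
train_b = rng.normal([1.0, 1.0], 0.5, size=(250, 2))

mu_a, mu_b = train_a.mean(0), train_b.mean(0)
pooled = np.cov(np.vstack([train_a - mu_a, train_b - mu_b]).T)
w = np.linalg.solve(pooled, mu_a - mu_b)      # Fisher's linear discriminant
threshold = w @ (mu_a + mu_b) / 2.0           # midpoint between class means

def classify(x):
    return 'a' if x @ w > threshold else 'b'

# Misclassification is verified on an independent test set, as in the
# study's design; resubstitution error would be optimistically biased.
test_a = rng.normal([0.0, 0.0], 0.5, size=(250, 2))
test_b = rng.normal([1.0, 1.0], 0.5, size=(250, 2))
errors = sum(classify(x) != 'a' for x in test_a) + \
         sum(classify(x) != 'b' for x in test_b)
rate = errors / 500
```

With well-separated class means the test-set error rate stays far below the 50% expected of a random allocator, mirroring the gap between the observed misclassification rates and the 88.9% expected absent any discriminating ability.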
DISCUSSION AND CONCLUSIONS
Information about the generating processes that led to our nine different scenarios is contained in spatial statistics. Can this information be used in real world situations to identify the nature of evolutionary processes? The spatial autocorrelation analysis cannot inescapably lead to the correct evolutionary interpretation, but it can at least provide evidence for or against specific alternative hypotheses. By analysing a minimum of 12 to 15 allele frequencies, we increase the likelihood of finding at least some allele frequencies for which the immigrant gene-frequency differential is substantial, or for which selection coefficients are large enough to produce a detectable pattern.
Sources of variation in spatial autocorrelation

Slatkin & Arter (1991) discuss four sources of variation in spatial population-genetic data. The first is sampling variation that results from the Mendelian mechanism during reproduction, e.g. genetic drift. The second source they call stochastic variation; it is the realization error of the generating process, which, with identical parameters, yields different results at different times and locations for the same allele, or for different alleles subject to the same parametric values. The third source of variation is parametric variation, the variation in mutation rates and selection coefficients; we add to this differences in immigration differentials for different allele frequencies. Finally, the fourth source of variation is the initial gene frequencies, important in studies of recently evolved populations not at or near equilibrium. Of our nine scenarios, all embody sampling variation, as well as initial gene frequency
Figure 5. A, one-dimensional correlogram for the surface in Figure 4A. B, one-dimensional correlogram for the surface in Figure 4B. (Both panels plot Moran’s I against the upper limits of the distance classes.)
variation. Scenarios 1 and 2, panmixia and drift, are limited to these sources of variation. Scenario 3, the stepping-stone model, also depends on stochastic variation since it incorporates a realization error. All other scenarios in our study reflect all four sources of variation mentioned by Slatkin & Arter. In our scenarios we ignored mutation because, unlike Slatkin & Arter, we initiated our populations at intermediate gene frequencies, and felt that real-world mutation rates would have little effect in only 49 generations. Slatkin & Arter (1991) were concerned that the presence of stochastic variation and of parametric variation would confound the patterns and make it difficult to infer processes from patterns by means of spatial autocorrelation methods. This was not the case in these simulations. The discriminant function analyses and the subjective method were successful despite all four sources of variation. We feel that the concerns expressed by Slatkin & Arter are not critical. Unlike the study of Slatkin & Arter (1991), who simulated populations at or near equilibrium, our simulations terminated after relatively few generations (49) for all
Figure 6. A, two-dimensional correlogram for the surface in Figure 4A. The sectors of this ‘windrose’ represent distance-direction classes. Successive annuli are equal interval distances of 1.5. Shading ranges from black for strong positive to white for strong negative. Sectors with insufficient replication are not shown. B, two-dimensional correlogram for the surface in Figure 4B. Other comments as in Figure 6A.
TABLE 6. Outcome of the subjective analysis of 25 datasets

True scenario  Subjective analysis  Order of presentation
2              2                     7
2              2                    11
2              2                    18
2              2                    21
3              3                     1
3              3                     2
3              6 (3)                 3
3              3                     4
3              3                    12
4              4                     8
4              4                     9
4              4                    10
4              4                    15
4              4                    16
5              5                    19
5              5                    23
5              5                    25
6*             4                    13
6*             4 (8)                17
6*             4 (6)                22
7              5                     6
7              7                    14
8              5                     5
9*             8                    20
9*             4                    24

NOTES: Data were randomly selected from the strong parameter test set. They are ordered here by scenario but were presented in random sequence (shown in column 3) to the subjective interpreter. The three numbers in parentheses are alternative classifications suggested by the analyst, who was not certain of his first interpretation. *Scenario misclassified by 5-variable discriminant function.
scenarios except 5, stepping stone plus two migrations, which ran for 98 generations. We did this deliberately because most of the examples analysed in the literature are for populations unlikely to have been subject to constant environmental conditions for periods of 500 or 1000 generations. It might be useful to point out that the autocorrelation methods employed in this study will typically assay the population parameters that have been active in the recent or even not so recent past. If, at the time of sampling, there has just been a change in parameters, this change would not be picked up until a sufficient number of generations has passed for that change to make itself felt on the gene-frequency surface. As Slatkin & Arter (1991) point out, recently evolved populations are subject to an additional source of variation — their initial frequencies and initial differences in spatial distributions. One allele might initially have the same frequency everywhere, while another might have a steep cline. These authors worried whether this might make inferences from spatial autocorrelation analysis more problematical. However, in our work we created situations in which some immigrant allele frequencies were essentially the same as the native frequencies, leading to a situation of similar frequencies everywhere. In other cases, in the same dataset, the immigrant allele-frequency differential was great, leading to a steep cline. Despite the wide variation in initial gene frequencies resulting from sampling the actual gene frequencies of the three human populations, we identified patterns in substantial numbers of cases in the weak parameter study and with outstanding success in the strong parameter study. Spatial autocorrelation coefficients such as Moran’s I are expressions of the similarities between neighboring data points.
These similarities may arise either because of differences across space in the underlying expected value of the surface (caused by fixed structures such as islands, peninsulas, mountain ranges, population enclaves or selection gradients) or because of true autocorrelational phenomena, such as the gene flow that causes ‘waves’ in a pure isolation by distance surface. Conceptually, these two aspects of spatial pattern are distinguishable because
features caused by fixed structures will recur at the same locations in independent replicates of the spatial surface, while features due to pure autocorrelation will vary. In any case, Moran’s I cannot, by itself, distinguish between these phenomena, and so reports them all. If the links between point-pairs in a given distance class correspond to a known fixed structure, then the autocorrelation coefficient can be viewed as a summary of data similarities arising from that structure. If the links are distributed widely over the area, then similarities are more likely due to an autocorrelation process. Our simulations make the simplifying assumption that selection gradients are always clinal. In the real world some selection patterns will be more complicated. If we know the nature of the selective agent and can measure it at various stations on the gene-frequency surface, we can carry out specific tests for such selection trends. However, in the exploratory mode of most gene-frequency variation studies, one needs to simplify trends considerably. We will not be able to detect selection unless it affects only one or a few allele frequencies and unless its pattern of distribution is distinct from other clines caused by gene flow. In this respect selection can actually obscure patterns caused by other evolutionary forces if it coincides with, or largely overlaps, them.

Weak and strong parameters

The weak-parameter discriminant functions performed less well than did the strong-parameter ones, whose percentage misclassification error rates were minimal (see Table 2), and the different scenarios did not respond equally to the changed conditions. Migration scenarios were distinguished quite well, even in the weak parameter data, whereas selection was not recognized. In our effort to make the data as realistic as possible, our selection coefficients for most loci were quite low (Figure 2A).
Many values for the upper limit of the selection coefficient under weak parameters are therefore likely to fall below 0.01, yielding an Ns < 0.64. By contrast, even in the weak parameter experiment, the product Nm is 7.5, leading to a relatively more deterministic outcome. If we are satisfied with inferring simple migration, without being able to tell whether it was accompanied by selection or migration in a second direction, the adjusted error in Table 2 shows quite acceptable misclassification error percentages for the weak parameters; for 20 variables, it is only 7.11%. Were the strong parameters too strong to be found in real biological populations? Our gene-frequency differentials occur in real populations, albeit not in contiguous ones. Effective population sizes as low as 64 might be relatively rare; our strong parameter simulation, with an effective population size of 448, might be more realistic in many human situations. When combined with the migration coefficient we obtain values of Nm = 7.5 for the weak parameters, and 52.5 for the strong parameters. The lower value falls well within the range estimated for 16 different species by Slatkin (1985). The upper value exceeds the highest estimate reported by Slatkin (Nm = 42.0 for Mytilus edulis), who did not examine any human data. Many selection coefficients are undoubtedly quite low, approaching neutrality. Results of such selection probably will not be picked up by spatial autocorrelation or by any other method. Our analyses detected a few instances of selection with higher selection coefficients. If the selection coefficients had been consistently higher,
selection would have been detected invariably. Endler (1986) reports 566 selection coefficients in 36 species (including humans) ranging between 0 and 1, so values higher than those employed even in the strong parameter simulations are possible. From Tables 3 and 4 we can see that scenarios 1 and 2 were always correctly classified and separated from the other patterns. There is no autocorrelation in these data, which alone separates the groups infallibly. The groups with selection are difficult to tell apart from those that have only the basic stepping-stone model, except for the strong parameter case, where they are easily separated. The examples with migration, by contrast, are clearly separated from those of the pure stepping-stone model with either set of parameters. Another easily distinguished pattern is intrusion, which yields characteristic correlograms when the immigration differential is substantial. Overall, selection is the hardest case to distinguish. Since migration is at a constant rate, one might expect it to lack parametric variation, but because the immigration differential varies from allele to allele, the resulting product of migration and dispersal shows a good deal of fluctuation and yields the equivalent of parametric variation, just as for the selection coefficients. Immigration differentials create the same effect as varying migration coefficients.

Practical considerations

Slatkin & Arter (1991) list three applications of spatial autocorrelation: the detection of hidden geographic variation, trends, and patterns; tests of significance of spatial patterns; and inferences about microevolutionary processes from contemporary gene-frequency patterns. These authors concur that the method is useful in the first two instances, but doubt its power for the third. However, our results clearly show that our methods of spatial autocorrelation analysis will in many cases lead to useful inferences about generating processes.
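All of these inferences rest on Moran's I as the workhorse statistic. For concreteness, a bare-bones version for a single distance class can be sketched as follows (our own illustration; the study's correlograms use many distance and direction classes, with Bonferroni-corrected significance tests):

```python
def morans_i(values, weights):
    """Moran's I for one distance class.

    values  : list of observations (e.g. gene frequencies at localities)
    weights : dict mapping ordered index pairs (i, j) to connection
              weights, typically 1 if the pair falls in the class.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # cross-product term over all connected pairs
    num = sum(w * dev[i] * dev[j] for (i, j), w in weights.items())
    den = sum(d * d for d in dev)
    w_sum = sum(weights.values())
    return (n / w_sum) * (num / den)
```

A smooth cline scored against nearest-neighbour links yields a positive I, while values alternating between neighbours yield a negative I, which is why correlogram shape carries information about the generating process.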
The discriminant analysis to which we devoted considerable time and journal space cannot serve as a model for spatial autocorrelation analysis in the real world. In any actual example, we will not be limited to nine known scenarios and will not have these replicated numerous times so that we can apply discriminant functions to them. The main purpose of employing discriminant functions in this study was to demonstrate, in an economical way, that spatial autocorrelation analysis furnishes information capable of resolving groups affected by different major evolutionary factors. The discriminant analysis suggests the lowest misclassification rates for these data. In actual cases, the analyst will use exploratory procedures similar to those of the subjective approach in this study, followed by more direct tests of the postulated model. Our subjective analysis, while more labour-intensive than the discriminant function analysis, has more relevance to actual datasets and suggests a more realistic misclassification rate, 32%. This rate was higher than that of the 5-variable discriminant function (16.63%, Table 2), but the error rate of that discriminant function on the sampled 25 datasets (20%) was also higher than 16.63%. Below we discuss the nature of the errors, possible reasons for the errors, and how relevant these would be in other studies. It is clear that limiting the alternative hypotheses to nine possible evolutionary scenarios is not feasible in the real world where a multitude of combinations of evolutionary factors may be at work, some of them not yet imagined. Our scenarios
outline the majority of classes of phenomena our methods are capable of exploring. Except for mutation, we have allowed for the main evolutionary forces: random mating, genetic drift, effective population size, migration (both local dispersal and long-distance migration), and selection. Even if the fine details of an actual scenario differ from the schematic outlines we simulated (e.g. complicated structures for gene flow or selective surfaces), we can assign major responsibility to one or two sets of factors and then achieve a better parameterization of the factor(s) in question with tests of more refined models.

Subjective errors investigated

Five of the eight datasets the subjective analyst identified incorrectly were cases in which he interpreted selection as migration. The key to recognizing selection is the presence of a trend (a cline) in a unique direction manifested in only one or a very few allele frequencies; for migration, clines should be found in many allele frequencies. Under the strong parameter regime, more allele-frequency surfaces experience selection coefficients that are detectable, giving rise to the perception that these surfaces represent migration. In nature, most selection coefficients are unlikely to be as high as in this study, so the criterion ‘repeated trend = migration, unique trend = selection’ is more likely to be correct. Another two of the eight misclassified datasets were examples of intrusions that were misclassified by both the subjective analyst and the 5-variable discriminant function. The analyst had difficulty perceiving this pattern in the surfaces. Whereas in the full 21 × 21 lattice the V-shaped intrusion (Fig. 3) is very noticeable if the immigrant gene-frequency differential is sufficiently large, in the inner 7 × 7 sublattice the V is flat and, after 49 generations of dispersal generating gene flow, is only fuzzily demarcated.
It is difficult for the subjective interpreter to distinguish this model from an ordinary migration cline, which is how the two erroneous datasets were recorded. Note that the 5-variable discriminant function missed these two as well, although the 20-variable function correctly identified them. Finally, the eighth misidentified dataset was a pure stepping-stone model that exhibited a few apparently randomly engendered clines, leading to an inference of selection where none exists. This phenomenon, akin to a Type I error against the null hypothesis of stepping stone without mutation or selection, has been noted previously (Sokal & Jacquez, 1991; Slatkin & Arter, 1991). Purely stochastic data may generate apparent patterns at random. By chance, none of the 25 datasets came from scenario 1 (panmixia), but a combination of nonsignificant G-values for the surfaces, together with absence of autocorrelation, should have sufficed to clearly distinguish all instances of panmixia.
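The spurious-cline phenomenon is easy to reproduce with a toy simulation: pure stepping-stone drift, with no selection and no long-distance migration, will occasionally generate trends that look clinal. The lattice size, deme size, migration rate, and generation count below are illustrative choices of ours, not the study's parameters:

```python
import random

random.seed(2)

def stepping_stone_1d(demes=20, deme_size=50, p0=0.5, generations=49, m=0.1):
    """One allele frequency per deme; each generation, demes exchange
    migrants with their lattice neighbours, then drift by binomial
    sampling of 2N gametes."""
    p = [p0] * demes
    for _ in range(generations):
        # migration: mix each deme with the mean of its neighbours
        mixed = []
        for i in range(demes):
            nbrs = [p[j] for j in (i - 1, i + 1) if 0 <= j < demes]
            mixed.append((1 - m) * p[i] + m * sum(nbrs) / len(nbrs))
        # drift: binomial sampling of 2N gametes per deme
        p = [sum(random.random() < q for _ in range(2 * deme_size)) / (2 * deme_size)
             for q in mixed]
    return p

surface = stepping_stone_1d()
```

Regressing the resulting frequencies on deme position over repeated runs shows that some replicates acquire trends of appreciable slope by drift alone, which is exactly the pattern an analyst might misread as selection.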
ACKNOWLEDGEMENTS
Contribution No. 963 in Ecology and Evolution from the State University of New York at Stony Brook. This research was supported by grant DEB9220538 from the National Science Foundation to Robert R. Sokal. The extensive computations were made possible by a grant of supercomputer time from the Cornell Theory Center.
We appreciate constructive comments by Monty Slatkin on an earlier version of this manuscript.
REFERENCES

Barbujani GB, Sokal RR. 1991. Genetic population structure of Italy. I. Geographic patterns of gene frequencies. Human Biology 63: 253–272.
Endler JA. 1986. Natural selection in the wild. Princeton, New Jersey: Princeton University Press.
Falsetti AB, Sokal RR. 1993. Genetic structure of human populations in the British Isles. Annals of Human Biology 20: 215–229.
Kimura M. 1953. “Stepping stone” model of population. Annual Report of the National Institute of Genetics, Japan 3: 63–65.
Nei M. 1987. Molecular evolutionary genetics. New York: Columbia University Press.
Oden NL, Sokal RR. 1986. Directional autocorrelation: An extension of spatial correlograms to two dimensions. Systematic Zoology 35: 608–617.
Roychoudhury AK, Nei M. 1988. Human polymorphic genes: world distribution. New York: Oxford University Press.
SAS Institute. 1989. SAS/STAT user’s guide, version 6, 4th edition. Cary, North Carolina: SAS Institute.
Slatkin M. 1985. Rare alleles as indicators of gene flow. Evolution 39: 53–65.
Slatkin M, Arter HE. 1991. Spatial autocorrelation methods in population genetics. American Naturalist 138: 499–517.
Sokal RR. 1979. Ecological parameters inferred from spatial correlograms. In: Patil GN, Rosenzweig ML, eds. Contemporary quantitative ecology and related ecometrics. Fairland, Maryland: International Co-operative Publishing House, 167–196.
Sokal RR. 1984. Spatial analysis in population biology and regional science. In: Andersson AE, Isard W, Puu T, eds. Regional and industrial development theories, models, and empirical evidence. Amsterdam: North-Holland, 241–266.
Sokal RR. 1986. Spatial data analysis and historical processes. In: Diday E, Escoufier Y, Lebart L, Pages J, Schektman Y, Tomassone R, eds. Data analysis and informatics IV. Amsterdam: North-Holland, 29–43.
Sokal RR, Harding RM, Lasker GW, Mascie-Taylor CGN. 1992. A spatial analysis of 100 surnames in England and Wales. Annals of Human Biology 19: 445–476.
Sokal RR, Harding RM, Oden NL. 1989a. Spatial patterns of human gene frequencies in Europe. American Journal of Physical Anthropology 80: 267–294.
Sokal RR, Jacquez GM, Wooten MC. 1989b. Spatial autocorrelation analysis of migration and selection. Genetics 121: 845–855.
Sokal RR, Jacquez GM. 1991. Testing inferences about microevolutionary processes by means of spatial autocorrelation analysis. Evolution 45: 152–168.
Sokal RR, Lengyel IA, Derish PA, Wooten MC, Oden NL. 1987a. Spatial autocorrelation of ABO serotypes in medieval cemeteries as an indicator of ethnic and familial structure. Journal of Archaeological Science 14: 615–633.
Sokal RR, Oden NL, Barker JSF. 1987b. Spatial structure in Drosophila buzzatii populations: simple and two-dimensional spatial autocorrelation. American Naturalist 129: 122–142.
Sokal RR, Oden NL. 1978a. Spatial autocorrelation in biology 1. Methodology. Biological Journal of the Linnean Society 10: 199–228.
Sokal RR, Oden NL. 1978b. Spatial autocorrelation in biology 2. Some biological implications and four applications of evolutionary and ecological interest. Biological Journal of the Linnean Society 10: 229–249.
Sokal RR, Oden NL. 1991. Spatial autocorrelation analysis as an inferential tool in population genetics. American Naturalist 138: 518–521.
Sokal RR, Smouse PE, Neel JV. 1986. The genetic structure of a tribal population, the Yanomama Indians. XV. Patterns inferred by autocorrelation analysis. Genetics 114: 259–281.
Sokal RR, Wartenberg DE. 1981. Space and population structure. In: Griffith D, McKinnon R, eds. Dynamic spatial models. Alphen aan den Rijn, The Netherlands: Sijthoff and Noordhoff, 186–213.
Sokal RR, Wartenberg DE. 1983. A test of spatial autocorrelation using an isolation-by-distance model. Genetics 105: 219–237.