Assessing relative abundances in fossil assemblages

Palaeogeography, Palaeoclimatology, Palaeoecology 253 (2007) 317 – 322 www.elsevier.com/locate/palaeo

Jason R. Moore a,⁎, David B. Norman a, Paul Upchurch b

a Department of Earth Sciences, University of Cambridge, Downing Street, Cambridge, CB2 3EQ, UK
b Department of Earth Sciences, University College London, Gower Street, London, WC1E 6BT, UK

Received 7 December 2006; received in revised form 4 June 2007; accepted 6 June 2007

Abstract

The relative abundances of taxa or skeletal elements in a fossil assemblage can provide important information concerning the palaeoecology and taphonomy of the assemblage. However, these relative abundances must be estimated from samples of the assemblage, rather than measured directly. The sampling error this produces decreases the accuracy with which relative abundances can be estimated from the fossil record. Using the multinomial distribution it is possible to place constraints on the accuracy of estimation of relative abundance, provided that two out of three key parameters (sample size, required degree of similarity and confidence level) are known. Applying this methodology to the fossil record it can be shown that in order to be 95% confident the taxon relative abundances of a fossil assemblage lie within 5% of those found in a sample, 534 individuals must be collected. This methodology enables the assessment of published relative abundance estimates and the development of sampling protocols for future studies. © 2007 Elsevier B.V. All rights reserved.

Keywords: Palaeoecology; Taphonomy; Analytical methods; Diversity; Relative abundance

⁎ Corresponding author. E-mail address: [email protected] (J.R. Moore).
0031-0182/$ - see front matter © 2007 Elsevier B.V. All rights reserved. doi:10.1016/j.palaeo.2007.06.004

The relative abundances of taxa or skeletal parts in a fossil assemblage, vertebrate or invertebrate, are some of the easiest parameters to calculate but some of the most difficult to interpret with any confidence. Many workers have attempted to draw palaeoecological conclusions from the relative abundances of taxa (for example Shotwell, 1955, 1958; Clark et al., 1967; Wolff, 1975; Farlow, 1976; Bakker, 1972; Kumar, 1992). However, more recent comparisons of the abundances of vertebrate and invertebrate taxa between biocoenoses (ecological communities) and thanatocoenoses (death assemblages) have demonstrated that both taphonomic and sampling biases can cause large differences between these stages (Behrensmeyer et al., 1979; Staff et al., 1985, 1986; Bennington and Bambach, 1996; Kidwell, 2001). Were it possible to interpret fossil assemblage taxon/part relative abundance with confidence, the precise nature of taphonomic biases affecting fossil assemblages or the dominance and community structure of fossil assemblages could be investigated. Comparisons with modern communities, environments or processes would then be facilitated, opening further significant avenues for palaeoecological and taphonomic research. However, at present such investigations are limited by two major factors: the influence of taphonomic biasing and the accuracy with which the parameters of an assemblage can be described by a necessarily small sample. In this paper we aim to address the second of
these factors, providing a methodology to assess sampling effects on the relative abundances of objects (taxa or skeletal elements) in a fossil assemblage. For the remainder of the paper, we will refer to species abundances, as these are the most commonly analysed abundances in the palaeontological literature. However, it should be emphasised that this methodology is applicable to any sample that consists of a series of relative abundances, no matter what property/taxon/hypothesis is under investigation.

Sampling effects are the random variations introduced into statistical population parameters when they are estimated from a sample of a population rather than measured directly. Various statistical techniques have been developed to compensate for sampling effects in different situations (see Hayek and Buzas, 1997 for a summary). The most commonly used methods to estimate the sampling effects on species abundances are those based on the binomial distribution (including Dennison and Hay, 1967 and Fatela and Taborda, 2002) and those using multiple samples to generate cluster confidence intervals around species abundances (Buzas, 1990; Bennington and Rutherford, 1999). These methods provide comparable results when individual species abundances are low, but for higher-abundance species (greater than 0.1) more accurate estimates of the error in measured abundances are obtained using cluster confidence intervals (Buzas, 1990). Calculating cluster confidence intervals requires a number of repeat samples to be taken from a single fossil assemblage. Collecting repeat samples is relatively simple for micropalaeontological assemblages (Bennington and Rutherford, 1999); however, for macropalaeontological assemblages (vertebrate or invertebrate), where sampling effort is much higher, and for any species abundance data that have already been acquired, generating repeat samples is much more difficult. Consequently, an alternative method to estimate the sampling effects on species abundances is required.
A less frequently used approach to the estimation of the errors in measured species abundances uses the multinomial distribution (Patterson and Fishbein, 1989). The multinomial distribution is an extension of the binomial distribution in which observations can fall within x, rather than 2, different categories (more information on the multinomial distribution can be found in Degroot and Schervish, 2002). From a statistical point of view, the relative abundances of species in a fossil assemblage can be considered to be the parameters of a multinomial distribution. To account for sampling effects, it is necessary to determine how accurately these abundances can be estimated from a sample of a given size (i.e. the size of the confidence interval within which the true species abundances lie). Applying a multinomial model, rather than a binomial model, to the estimation of species abundances allows the simultaneous estimation of the confidence intervals for the abundances of all species in an assemblage.

Using the multinomial distribution it is possible to calculate either the probability that relative abundances within the sample are similar to those in a fossil assemblage with a given degree of confidence (α), or the sample size (n) necessary to reach a certain α level. However, in order to calculate either of these parameters, a further factor must be considered: to what extent can the sample relative abundances vary from the assemblage abundances and yet still be considered “similar”? This takes the form of another variable, d (the “required degree of similarity”), which is the amount, in percentage points of the whole assemblage, by which sample abundance can vary from assemblage abundance (i.e. if d = 5% or 0.05 and true relative abundance = 0.4, sample abundance can lie between 0.35 and 0.45 and still be considered similar). The choice of the required degree of similarity depends on the nature of the data considered and the particular requirements of the study in question. With larger n, it will be possible to obtain a significant α level for smaller d.

It is possible to calculate optimal-case species abundance estimates using the multinomial distribution, provided that the relative abundances of the species comprising the fossil assemblage are known with some degree of accuracy. It is rare to have such a priori knowledge of abundances in the fossil record, unless every fossil in the assemblage has been collected, in which case there would be no need to carry out an analysis of this type. In many cases, the sampled species abundances are assumed to be good estimates of the true population species abundances to allow for an optimal-case analysis. This assumption may not be viable in some analyses, for example when calculating the sample size required for a future study.
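As an illustration of the optimal-case idea, the following Python sketch estimates by simulation how often every sample abundance falls within d of its true value at once, given assumed true abundances. The function name `simultaneous_coverage`, the four-species abundances and the number of trials are assumptions made for this example only; they are not taken from the paper, and the Monte Carlo approach is a stand-in for an exact multinomial calculation.

```python
import random

def simultaneous_coverage(p, n, d, trials=2000, seed=0):
    """Estimate P(every sample proportion lies within d of its true value)."""
    rng = random.Random(seed)
    categories = list(range(len(p)))
    hits = 0
    for _ in range(trials):
        # Draw one sample of n individuals from the assumed assemblage.
        draws = rng.choices(categories, weights=p, k=n)
        counts = [0] * len(p)
        for c in draws:
            counts[c] += 1
        # The sample counts as "similar" only if all categories pass together.
        if all(abs(counts[i] / n - p[i]) <= d for i in categories):
            hits += 1
    return hits / trials

# Hypothetical assemblage of four species (illustrative values only).
true_abundances = [0.4, 0.3, 0.2, 0.1]
print(simultaneous_coverage(true_abundances, n=534, d=0.05))
print(simultaneous_coverage(true_abundances, n=100, d=0.05))
```

The larger sample should give a markedly higher estimated probability than the smaller one, which is the trade-off between n, d and confidence that the text describes.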
Alternatively the assumption may be judged to be inappropriate. Species abundances vary significantly between repeat samples of a fossil assemblage (Buzas, 1990). Consequently, the abundances of species in a single sample of an assemblage may not accurately reflect the true population abundances. A similar observation relating to environmental parameters rather than species abundances has been made by Olszewski (1999). For abundances calculated from a single or small number of samples it is therefore safer to take a worst-case approach, making no assumptions about the true species abundances. One of the useful properties of the binomial and multinomial distributions is that there are worst-case values for the relative abundances in each of the x categories (individual species abundances) under consideration, at which point the maximum value of n for a given α and d (or minimum α level for a given n and d) is achieved. Providing that it is acceptable to over-estimate the sample size necessary to reach a particular confidence limit, or underestimate the confidence with which abundances can be constrained at a certain sample size, this allows the estimation of α or n in fossil assemblages without any a priori knowledge and, importantly, with the minimum number of assumptions regarding the underlying relative abundance distribution.

Table 1
Sample sizes required to reach set confidence limits (α) that the relative abundances of objects within a sample are within d (%) of the true abundances of those objects, using the worst multinomial case

α        d = 1%    d = 2%    d = 5%    d = 10%   d = 20%
0.750    6390      1598      256       64        16
0.900    10,348    2587      413       104       26
0.950    13,342    3336      534       134       34
0.990    20,293    5074      812       203       51
0.999    30,238    7560      1210      303       76

A computer program to calculate intermediate values is available on request from the corresponding author.

For the binomial distribution (i.e. a sample with only two taxa or categories of skeletal element) the worst-case/minimum-assumption relative abundances are equal (i.e. 0.5). The situation for multinomial distributions is more complex, with the minimum-assumption abundances varying with α. The relationship between minimum-assumption parameter values and α has been determined by Thompson (1987) and from this it is possible to calculate minimum-assumption n values for a given value of d and α. For the minimum-assumption scenario, several of the relative abundances are equal and the remainder are zero. In this situation, m (the number of species abundances taking the same value) varies predictably between 2 and 6 as α increases from 0 to 1. Once this relationship has been determined, it is possible to use m along with d, α and z values from the normal distribution (where z is the upper (α/2m) × 100th percentile of the standard normal distribution) to calculate n using Eq. (1) (Thompson, 1987). Note that the use of max indicates that n varies with m and that the required value of n is the maximum, given all values of m. In this instance, this can be achieved by substituting m values into the equation until the maximum n is found.

\[
n = \max_{m} \frac{z^{2}\,(1/m)\,(1 - 1/m)}{d^{2}} \tag{1}
\]
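The substitution procedure for Eq. (1) can be sketched in Python as follows. This is a minimal sketch, not the authors' program: the function name `worst_case_n` and the search limit `max_m` are assumptions, and z is evaluated with the tail area taken as one minus the confidence level, following the convention of Thompson (1987), where α denotes the significance level.

```python
import math
from statistics import NormalDist

def worst_case_n(confidence, d, max_m=20):
    """Return (n, m): worst-case sample size from Eq. (1) and the m attaining it."""
    # Assumption: the tail area used for z is 1 - confidence (Thompson, 1987).
    alpha = 1.0 - confidence
    best_n, best_m = 0.0, None
    for m in range(2, max_m + 1):
        # Upper (alpha / 2m) x 100th percentile of the standard normal.
        z = NormalDist().inv_cdf(1.0 - alpha / (2 * m))
        n = z ** 2 * (1 / m) * (1 - 1 / m) / d ** 2
        if n > best_n:
            best_n, best_m = n, m
    # Round up, since a sample size must be a whole number of individuals.
    return math.ceil(best_n), best_m

print(worst_case_n(0.95, 0.05))  # (510, 3)
```

Under these assumptions the maximum for 95% confidence occurs at m = 3. The resulting n is somewhat smaller than the corresponding entry in Table 1, whose values appear to follow from the fitted relationship in Eq. (2) and so need not coincide exactly with Eq. (1).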

It should be noted that these calculations apply no matter how many species abundances the sample contains (hence, for the minimum-assumption scenario, there can be any number of taxa or skeletal elements in a fossil assemblage and the values of n and α will be the same). The only constraint on the number of species that can be used in this model is that it must exceed m. Eq. (1) is ideal for calculating the sample size necessary to reach a desired confidence level, but does not allow the calculation of confidence level for a known sample size as each value of m relates to a range of α values. This latter calculation is more useful from a palaeontological standpoint as this allows the determination of the probability with which a collection of known size represents the assemblage from which it was drawn. Regressing α against d²n for the appropriate values of m (from the table presented by Thompson, 1987; R² = 0.9972) establishes the minimum-assumption relationship between confidence level, sample size and required degree of similarity (Eq. (2)).

\[
\alpha = 1.0975\, e^{-2.3152\, d^{2} n} \tag{2}
\]
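Eq. (2) can be rearranged to give any one of n, d or the confidence level from the other two. The Python sketch below shows the three rearrangements; the function names are hypothetical. It assumes that α in Eq. (2) is the tail probability, i.e. one minus the confidence level listed in Table 1, because that reading reproduces the tabulated values (for example 534 individuals for 95% confidence and d = 5%).

```python
import math

A, B = 1.0975, 2.3152  # fitted coefficients quoted in Eq. (2)

def required_n(confidence, d):
    """Sample size at which Eq. (2) reaches the requested confidence."""
    alpha = 1.0 - confidence  # assumption: Eq. (2) gives the tail probability
    return math.ceil(math.log(A / alpha) / (B * d ** 2))

def achievable_d(confidence, n):
    """Degree of similarity d (as a fraction) attainable with n individuals."""
    alpha = 1.0 - confidence
    return math.sqrt(math.log(A / alpha) / (B * n))

def confidence_for(n, d):
    """Confidence level implied by Eq. (2), floored at zero for very small n."""
    alpha = A * math.exp(-B * d ** 2 * n)
    return max(0.0, 1.0 - alpha)

print(required_n(0.95, 0.05))                   # 534
print(round(100 * achievable_d(0.95, 820), 2))  # 4.03
print(round(confidence_for(534, 0.05), 3))      # 0.95
```

Because the relationship is a regression fit, the computed α exceeds 1 for very small values of d²n; the floor at zero in `confidence_for` is a choice made for this sketch.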

While it is possible to calculate d, n or α for any size fossil assemblage using this relationship, Table 1 gives some of the more common values, for reference. For example, it can be seen that in order to be 95% certain that sampled relative abundances lie within 5% of the true abundances of the fossil assemblage, a sample size of 534 individuals is required. This is larger than many of the samples that are used in published palaeoecological studies (for example Winkler, 1983: 435 skeletal elements from washed sample; Arribas and Palmqvist, 1998: 406 macromammal individuals), and indicates that caution must be exercised when assessing the relative abundances of taxa or skeletal elements in fossil assemblages. Considering d = 5% as a default required degree of similarity for future studies is reasonable, as it provides relatively good constraint on relative abundance without requiring impractically large sample sizes.

Table 2
95% confidence intervals for the relative abundances of taxa and skeletal elements given the sizes of the samples from Badgley (1986) and Russell (1967)

Study                                Taxon relative abundances     Skeletal element relative abundances
                                     Sample size    d (%)          Sample size    d (%)
Badgley (1986), Assemblage I         820            ±4.03          1257           ±3.26
Badgley (1986), Assemblage II        271            ±7.02          496            ±5.19
Badgley (1986), Assemblage III       76             ±13.25         926            ±3.80
Badgley (1986), Assemblage IV        56             ±15.43         366            ±6.04
Russell (1967), Oldman Formation     320            ±6.46          n/a            n/a

Calculated using Eq. (2).

Table 3
Ranges within which there is a 95% probability that the true taxon relative abundances for Badgley (1986) lie (lower and upper bounds in %)

Taxon               Assemblage I      Assemblage II     Assemblage III    Assemblage IV
                    Lower   Upper     Lower   Upper     Lower   Upper     Lower   Upper
Ramapithecidae      0       4.76      0       7.34      0       23.78     0       17.22
Carnivora           1.58    9.64      0       8.50      0       25.09     4.21    35.07
Proboscidea         0       7.57      0.36    14.40     0       18.51     0       26.14
Equidae             16.21   24.27     12.17   26.21     0       23.78     0       27.93
Chalicotheriidae    0       4.40      0       7.02      0       15.88     0       15.43
Rhinocerotidae      2.56    10.62     4.79    18.83     0       21.14     0       20.79
Suidae              0.85    8.91      2.54    16.61     0       26.41     0       27.93
Anthracotheriidae   0       6.23      0       8.50      0       17.19     0       15.43
Tragulidae          0.73    8.79      0       10.71     0       22.46     0       19.00
Giraffidae          0.97    9.03      2.21    16.25     0       17.19     0       20.79
Bovidae             42.07   50.13     28.77   42.81     7.80    34.30     13.14   44.00

Applying this methodology to real datasets, the limitations of small sample size on the accurate constraint of

relative abundance become clear. Two separate datasets will be considered: the classic study of the taphonomy of the Siwalik Group (Badgley, 1986), and the census of the Oldman Formation (Russell, 1967). Note that this bias towards vertebrate fossil assemblages represents the research interests of the authors, rather than a limit to the applicability of the methodology. Table 2 shows the 95% confidence intervals that can be placed on the relative abundances of taxa and skeletal elements for the two real datasets. Tables 3–5 show the range in which the taxon or skeletal element abundance estimates could lie given these confidence intervals.

Table 4
Ranges within which there is a 95% probability that the true skeletal element relative abundances for Badgley (1986) lie (lower and upper bounds in %)

Skeletal element           Assemblage I      Assemblage II     Assemblage III    Assemblage IV
                           Lower   Upper     Lower   Upper     Lower   Upper     Lower   Upper
Isolated tooth             19.17   25.69     21.22   31.60     13.69   21.29     24.56   36.64
Root/tusk                  0       4.29      0       7.21      0       4.77      0       8.77
Maxilla                    0       3.89      0       6.40      0       5.42      0       7.13
Mandible                   0.24    6.76      0       9.22      0       6.28      0.24    12.32
Skull                      0       5.88      0       7.41      0.41    8.01      0       10.14
Horn core                  0       5.25      0       7.00      0       4.56      0       7.13
Vertebra                   3.18    9.70      3.48    13.86     7.86    15.46     4.62    16.70
Scapula                    0       4.61      0       7.00      0       4.77      0       8.50
Pelvis                     0       5.25      0       7.21      0       6.18      0       7.13
Humerus                    0       5.49      0       7.21      0       5.96      0       8.77
Femur                      0       5.73      0       9.02      0       7.15      0       8.77
Radius                     0       6.36      0       6.80      0       6.07      0       8.23
Tibia                      0       6.52      0       8.42      0       6.18      0       8.50
Ulna                       0       4.37      0       5.59      0       4.56      0       8.77
Fibula                     0       3.66      0       5.19      0       4.12      0       7.13
Podial                     2.07    8.59      0       9.42      0       7.36      0       10.68
Calcaneum                  1.27    7.79      0       6.60      0       5.53      0       7.41
Astragalus                 0       4.53      0       6.20      0       5.10      0       7.68
Metatarsal                 0       4.14      0       6.80      0       4.23      0       6.59
Metacarpal                 0       4.69      0       6.80      0       4.34      0       6.86
Indeterminate metapodial   2.95    9.47      0       9.83      0.09    7.69      0       9.05
Phalanx                    4.38    10.90     0       7.81      0       7.15      0       10.68
Patella                    0       4.77      0       6.80      0       4.02      0       7.13
Rib                        4.62    11.14     11.34   21.72     17.47   25.07     1.06    13.14
Shaft                      5.49    12.01     0       8.62      6.14    13.74     0       7.13

Table 5
Ranges within which there is a 95% probability that the true taxon relative abundances for Russell (1967) lie

Taxon               % Lower bound    % Upper bound
Hadrosaur           35.42            48.34
Ankylosaur          4.48             17.40
Ceratopsian         17.60            30.52
Pachycephalosaur    1.35             14.27
Large Theropod      1.04             13.96
Small Theropod      1.35             14.27

Above it was suggested that d = 5% was a reasonable default value for the required degree of similarity. In this case, if d > 5% for a dataset then the sample size is considered too small to adequately constrain the relative abundances of its component groups. Hence the relative abundances in any dataset where d > 5% should not be considered meaningful. Table 2 shows that, for many of the individual datasets considered here, d > 5%, meaning that the relative abundances of much of the sampled data do not accurately represent the relative abundances of the taxa or skeletal elements in the fossil assemblages from which the samples were drawn. The exceptions are the skeletal element abundance data presented by Badgley (1986). Two of these datasets (Assemblage I and Assemblage III) show values of d sufficiently small to be considered accurate representations of the assemblage relative abundances.

On examination of Tables 3, 4 and 5, even for those samples that are large and so show what are considered to be well constrained relative abundances (i.e. d < 5%), the ranges over which the taxon or skeletal element abundances could vary are large with respect to the overall relative abundances of many taxa or elements. For example, examining Table 4 shows that for the skeletal element data of Badgley (1986), Assemblage II, the 95% relative abundance confidence intervals for 22 of 25 categories overlap significantly, with some overlap for the remaining three categories. Each of the datasets analysed here is composed of a few abundant categories and large numbers of rare categories, a pattern that is common in population studies of any kind.
The minimum-assumption approach used by this methodology to assess how well relative abundances have been estimated produces large errors for rare categories. This can only be overcome by increasing sample size or choosing a different abundance estimation method and making assumptions about the underlying relative abundance distribution.
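The ranges in Tables 3–5 are consistent with adding and subtracting the d value from Table 2 to each sample abundance, with lower bounds apparently truncated at zero. A minimal Python sketch of that calculation follows. The function name `abundance_range` is hypothetical, the truncation at 0 and 100% is an assumption inferred from the tabulated zeros, and the sample abundances used (41.88% and 0.73%) are back-calculated from the published bounds rather than quoted from the original datasets.

```python
def abundance_range(sample_pct, d_pct):
    """Return (lower, upper) bounds in percent, truncated to the 0-100 range."""
    lower = max(0.0, sample_pct - d_pct)    # assumed truncation at zero
    upper = min(100.0, sample_pct + d_pct)  # assumed truncation at 100
    return round(lower, 2), round(upper, 2)

# Hadrosaurs, Russell (1967): d = 6.46 from Table 2.
print(abundance_range(41.88, 6.46))  # (35.42, 48.34)

# A rare taxon, Badgley (1986) Assemblage I: d = 4.03 from Table 2.
print(abundance_range(0.73, 4.03))   # (0.0, 4.76)
```

The second case shows why rare categories are poorly constrained: the half-width d is several times larger than the sample abundance itself, so the interval runs from zero to well above the observed value.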

This methodology allows the simple assessment of the accuracy of estimates of the relative abundances of taxa, skeletal elements, etc. in fossil assemblages: it can be used to determine the range within which relative abundances will vary with a certain probability for one sample or the sample size required so that the relative abundances will lie within a predetermined range.

Conclusions

Using the methodology presented in this paper it is now feasible to assess studies that formulate hypotheses using the relative abundances of objects (whatever these objects may be) in a fossil assemblage. Along with other related studies (for example Bennington and Bambach, 1996), this new approach provides important information regarding optimum sample size, and hence provides structure to the design of a collection protocol, for palaeontological studies that intend to test hypotheses concerning relative abundances. This methodology will also allow the testing of the significance of the absence of taxa or skeletal elements from a fossil sample (i.e. taxa with a zero abundance in the sample).

Acknowledgements

The authors would like to thank the Ian Karten Charitable Trust for providing the funding for this work as part of a University of Cambridge Millennium Scholarship.

References

Arribas, A., Palmqvist, P., 1998. Taphonomy and paleoecology of an assemblage of large mammals: hyenid activity in the lower Pleistocene site at Venta Micena (Orce, Guadix–Baza Basin, Granada, Spain). Geobios 31, 3–47.
Badgley, C., 1986. Taphonomy of mammalian fossil remains from Siwalik rocks of Pakistan. Paleobiology 12 (2), 119–142.
Bakker, R.T., 1972. Anatomical and ecological evidence of endothermy in dinosaurs. Nature 238 (5359), 81–85.
Behrensmeyer, A.K., Boaz, D.E.D., Western, D., 1979. New perspectives in vertebrate paleoecology from a Recent bone assemblage. Paleobiology 5 (1), 12–21.
Bennington, J.B., Bambach, R.K., 1996. Statistical testing for paleocommunity recurrence: are similar fossil assemblages ever the same? Palaeogeography, Palaeoclimatology, Palaeoecology 127 (1–4), 107–133.
Bennington, J.B., Rutherford, S.D., 1999. Precision and reliability in paleocommunity comparisons based on cluster-confidence intervals: how to get more statistical bang for your sampling buck. Palaios 14, 506–515.
Buzas, M.A., 1990. Another look at confidence limits for species proportions. Journal of Paleontology 64 (5), 842–843.
Clark, J., Beerbower, J.R., Kietzke, K.K., 1967. Oligocene sedimentation, stratigraphy, paleoecology and paleoclimatology in the Big Badlands of South Dakota. Fieldiana. Geology Memoirs 5, 1–155.
Degroot, M.H., Schervish, M.J., 2002. Probability and Statistics. Addison-Wesley, Amsterdam.
Dennison, J.M., Hay, W.W., 1967. Estimating the needed sampling area for subaquatic ecological studies. Journal of Paleontology 41 (3), 706–708.
Farlow, J.O., 1976. A consideration of the trophic dynamics of a Late Cretaceous large dinosaur community (Oldman Formation). Ecology 57 (5), 841–857.
Fatela, F., Taborda, R., 2002. Confidence limits of species proportions in microfossil assemblages. Marine Micropaleontology 45, 174–196.
Hayek, L.C., Buzas, M.A., 1997. Surveying Natural Populations. Columbia University Press, New York.
Kidwell, S.M., 2001. Preservation of species abundance in marine death assemblages. Science 294 (5544), 1091–1094.
Kumar, K., 1992. Paratritemnodon indicus (Creodonta: Mammalia) from the early Middle Eocene Subathu Formation, NW Himalaya, India, and the Kalakot mammalian community structure. Palaeontologische Zeitschrift 66 (3–4), 387–403.
Olszewski, T., 1999. Taking advantage of time-averaging. Paleobiology 25 (2), 226–238.
Patterson, R.T., Fishbein, E., 1989. Re-examination of the statistical methods used to determine the number of point counts needed for micropaleontological quantitative research. Journal of Paleontology 63 (2), 245–248.
Russell, D.A., 1967. A census of dinosaur specimens collected in Western Canada. National Museum of Canada Natural History Papers 36, 1–13.
Shotwell, J.A., 1955. An approach to the paleoecology of mammals. Ecology 36, 327–337.
Shotwell, J.A., 1958. Inter-community relationships in Hemphillian (mid-Pliocene) mammals of Oregon and Texas. Ecology 39 (2), 271–282.
Staff, G.M., Powell, E.N., Stanton, R.J., Cummins, H., 1985. Biomass – is it a useful tool in paleocommunity reconstruction? Lethaia 18 (3), 209–232.
Staff, G.M., Stanton, R.J., Powell, E.N., Cummins, H., 1986. Time-averaging, taphonomy and their impact on paleocommunity reconstruction – death assemblages in Texas bays. Geological Society of America Bulletin 97 (4), 428–443.
Thompson, S.K., 1987. Sample size for estimating multinomial proportions. The American Statistician 41 (1), 42–46.
Winkler, D.A., 1983. Paleoecology of an Early Eocene mammalian fauna from paleosols in the Clarks Fork Basin, Northwestern Wyoming (USA). Palaeogeography, Palaeoclimatology, Palaeoecology 43 (3–4), 261–298.
Wolff, R.G., 1975. Sampling and sample size in ecological analyses of fossil mammals. Paleobiology 1 (2), 195–204.