Statistical inference for extinction rates based on last sightings

Statistical inference for extinction rates based on last sightings

Journal of Theoretical Biology 333 (2013) 166–173 Contents lists available at SciVerse ScienceDirect Journal of Theoretical Biology journal homepage...

1MB Sizes 1 Downloads 42 Views

Journal of Theoretical Biology 333 (2013) 166–173

Contents lists available at SciVerse ScienceDirect

Journal of Theoretical Biology journal homepage: www.elsevier.com/locate/yjtbi

Statistical inference for extinction rates based on last sightings Miguel Nakamura a,n, Pablo del Monte-Luna b,1, Daniel Lluch-Belda b,1, Salvador E. Lluch-Cota c,2 a

Área de Probabilidad y Estadística, Centro de Investigación en Matemáticas AC, Jalisco S/N, Col. Valenciana, CP 36240 Guanajuato, Gto., México Departamento de Pesquerías y Biología Marina, Centro Interdisciplinario de Ciencias Marinas del Instituto Politécnico Nacional, Avenida IPN s/n, Col. Playa Palo de Santa Rita, CP 23096 La Paz, BCS, México c Programa de Ecología Pesquera, Centro de Investigaciones Biológicas del Noroeste SC, Km. 1 Carretera a San Juan de la Costa “El Comitán”, AP 128, CP 23097 La Paz, BCS, México b

H I G H L I G H T S

    

Past and present extinction rates are generally assumed as constant. Marine extinctions are expected to rapidly increase in the future. We developed a model based on last sightings for estimating extinction rates. Recent marine extirpations are increasing but at an uncertain rate. A constant rate of modern extinctions in the sea is statistically plausible.

art ic l e i nf o

a b s t r a c t

Article history: Received 31 August 2012 Received in revised form 16 April 2013 Accepted 21 May 2013 Available online 31 May 2013

Rates of extinction can be estimated from sighting records and are assumed to be implicitly constant by many data analysis methods. However, historical sightings are scarce. Frequently, the only information available for inferring extinction is the date of the last sighting. In this study, we developed a probabilistic model and a corresponding statistical inference procedure based on last sightings. We applied this procedure to data on recent marine extirpations and extinctions, seeking to test the null hypothesis of a constant extinction rate. We found that over the past 500 years extirpations in the ocean have been increasing but at an uncertain rate, whereas a constant rate of global marine extinctions is statistically plausible. The small sample sizes of marine extinction records generate such high uncertainty that different combinations of model inputs can yield different outputs that fit the observed data equally well. Thus, current marine extinction trends may be idiosyncratic. & 2013 Elsevier Ltd. All rights reserved.

Keywords: Sampling effort Marine biodiversity Extirpations

1. Introduction Extinction is the ultimate fate of all species; in fact, 99.9% of all evolutionary lines that once existed on Earth have vanished (Jablonski, 2004). For some authors, such as Mayr (1942, 1970, 2001), Valentine (1973) and Futuyma (2009), the concept has been broadened to address population losses, also known as extirpations. In this study, we adopt this approach and include recent documented extirpations and extinctions in the marine realm

n

Corresponding author. Tel.: +52 473 732 7155; fax: +52 473 732 5749. E-mail addresses: [email protected] (M. Nakamura), [email protected] (P. del Monte-Luna), [email protected] (D. Lluch-Belda), [email protected] (S.E. Lluch-Cota). 1 Tel.: +52 612 12 30350; fax: +52 612 12 25366. 2 Tel.: +52 612 12 38484. 0022-5193/$ - see front matter & 2013 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.jtbi.2013.05.015

(from 1500 onwards; see MacPhee and Flemming (1999) for the sake of a more complete analysis). Although extinction is a recurrent evolutionary phenomenon, it does not proceed at the same pace at all times. Whilst a relatively low number of species usually become extinct during any given time span (background extinctions), there are periods during which a large proportion of the biota is exterminated in a very short time on a geological timescale (mass extinctions (Mayr, 2001)). These two different extinction rates have been considered as a reference point for contextualising the current state of the global biological diversity (Pimm and Jenkins, 2010). 1.1. The analysis of extinction In terrestrial groups, estimates of recent extinction rates are between 100 and 1000 times higher than the long-term global average derived from the geological record (May et al., 1995).

M. Nakamura et al. / Journal of Theoretical Biology 333 (2013) 166–173

This phenomenon has been interpreted as evidence of the 6th mass extinction period in the Earth’s history (Barnosky et al., 2011). No similar estimation exists for marine groups, but the increasing concern that human development may be acting as a major driver of current and future marine extinctions and biodiversity loss (Myers and Worm, 2003; Worm et al., 2006; Dulvy et al., 2009; Millennium Ecosystem Assessment, 2005; Rogers and Laffoley, 2011), is propelling formal attempts to estimate the number of species that have recently disappeared from the oceans (Smith and Solow, 2012), incorporating probabilistic methods. Over the past 500 years there have been 21 documented extinctions in the sea (Dulvy et al., 2003; del Monte-Luna et al., 2007), which corresponds to an average constant rate of 0.04 species per year (1 species every 25 years). This value is lower than the fossil background rate, estimated as 1 species per year (May et al., 1995). Another way to measure extinction rates is as a function of the extant biota per unit of time; i.e., the number of extinct species per million species per year. If there are 21 documented cases of contemporary global marine deletions, and considering an estimate of 1.4 to 2.2 million extant species in the ocean (Bouchet, 2006; Mora et al., 2011), then the current marine extinction rate would be  0.03 to  0.02 species per million species per year. Using this metric, historical background extinction rates stand between 0.1 and 1 species per million species per year or less (Primm et al., 1995; Pimm and Brooks, 2000). However, as noted by Regan et al. (2001), there are important issues to be considered when comparing present and past extinction rates, including the disparity of temporal scales, the amount of effort applied to detect modern extinctions and the uncertainty associated to the estimation of the total biodiversity. Thus, valid comparisons of this type demand more accurate information about modern extinction trends in the ocean. In the present contribution we show that it may be possible to infer extinction rates of a group of species when only the time of the most recent sighting is available, and independently of the total number of species on Earth. The exact time of extinction is unknown in most cases, therefore the time of last sighting or the last few sightings of a species are frequently used as indirect proxies for the true extinction date; however, this practice is highly controversial since there are numerous cases of species that have been sighted long after being declared extinct (Wignall and Benton, 1999). Many statistical methodologies for estimating extinction dates assume that a representative record of historical sightings precedes the time of last sighting (Roberts et al., 2010; Rivadeneira et al., 2009). In reality, however, the observational basis for inferring extinctions is very weak (Smith and Solow, 2012). Specifically, historical sighting records are difficult to secure and last sightings are often the only realistic data available (Dulvy et al., 2003, 2009). We explore a probabilistic approach to determine how recent marine extinction rates have changed over the past 500 years by: (1) proposing a mathematical description of the relationship between extinction, sampling effort, and time of last sighting; (2) describing how statistical inference may be conducted if the only reliable data point is the time of last sighting; and (3) applying the proposed method to datasets on recent marine extirpations and extinctions. We acknowledge that this study is based on the marine extirpations and extinctions that have been documented over the past 500 years, which may not coincide with the number of populations and species that have been lost without being recorded.

167

2. Models and inference for extinctions based on last sightings In formal statistical analysis, data are regarded as random observations that can be described through a probability density. This density depends on unknown parameters that become the items of interest in scientific studies (Sprott, 2000). In this section, we derive a probability density for the time of last sighting by conceptualising the data acquisition process. The time of last sighting results from the combination of the biological species itself, which is the entity subject to extinction, and the human observers, who record what is being sighted. These processes occur in time and, with rare exceptions (Turvey et al., 2007), they can be considered unpredictable. We begin by characterising probabilistically these two processes observed over a given time interval ½0; τ. 2.1. Modelling extinction, extinction rates, and sampling effort For describing the biological extinction of a given species, let us denote by E its random time to extinction, measured from time t ¼ 0. Assuming that we have observed over the interval ½0; τ, then E 4 τ means that the species did not become extinct within that lapse and E≤τ indicates that the species became extinct during the observed time frame, even if we do not know the exact value of E. For a general description of the probabilistic behaviour of E, we can use a continuous probability density gðtÞ over the domain of the positive real numbers. A cumulative probability distribution function is then implicitly defined as Z t GðtÞ ¼ gðuÞdu 0

ðor equivalently gðtÞ ¼ ðd=dtÞGðtÞÞ: Answering questions regarding extinction rates will be essential in our discussion. A complementary probabilistic concept, adopted from the survival analysis and reliability theory (Lawless, 2003), will be useful for that purpose: the hazard function (sometimes called failure rate), which for E is defined as hðtÞ ¼

gðtÞ : 1−GðtÞ

It has the interpretation of being an “instantaneous rate of extinction” in the following sense: hðtÞ ¼ lim

Δt-0

Pðt≤E o t þ ΔtjE 4 tÞ : Δt

For small Δt, hðtÞΔt is approximately the probability that the species will become extinct within the next Δt units of time given that it has not yet become extinct by time t. An important special case is a constant hazard, hðtÞ ¼ c, which can be seen to correspond to an exponential density, gðtÞ ¼ c expð−ctÞ. However, as the hazard function describes variation in the risk of extinction as a function of time, we may also refer to decreasing or increasing extinction rates. Addressing extinction rates via a hazard function appears to be a novel concept. Average measures of extinction rates of the form #extinctions 106  #species t 2 −t 1 or similar, are the norm in the biological literature (Regan et al., 2001). In this particular metric, the number of extinctions is calculated between years t 1 and t 2 and standardised to one million years. Such estimates assume that the extinction rate is constant, yet our interest is in investigating whether such rates vary over time. The hazard function, in contrast, is an instantaneous rate (an analogy is the rate shown by an automobile speedometer at any time vs. a mean rate given as the total distance travelled divided by the total time for a given trip).

168

M. Nakamura et al. / Journal of Theoretical Biology 333 (2013) 166–173

The hazard can be converted to units of species/time as follows. If NðtÞ is the total number of extant species at time t (the magnitude of the biodiversity), then NðtÞhðtÞΔt is the expected number of species going extinct in the next Δt units of time (for Δt small), and NðtÞhðtÞΔt=Δt ¼ NðtÞhðtÞ is the number of extinct species per unit of time (instantaneous rate). Both interpretations depend on the magnitude of NðtÞ; however, over the past 500 years, NðtÞ may still be extremely large compared to the number of documented extinctions (particularly in the marine realm), so that NðtÞ is approximately equal to N, the total number of extant species that were present at time t ¼ 0. This approximation allows the definition of an interesting quantity, the hazard ratio hðtÞ=hðt ref Þ ¼ NhðtÞ=Nhðt ref Þ, which is dimensionless and independent of N. Here, t ref is an arbitrary time reference to which we compare hazards at all other times, t. In the examples developed below, we fixed t ref as the year 2000. Regan et al. (2001) noted that care must be taken when calculating extinction rates that depend on uncertain quantities such as N, which may vary by more than one order of magnitude. The hazard ratio gives only the relative change in extinction velocity between times t and t ref ; absolute values of the rates are non-identifiable, but the result that this ratio does not depend on N, shields it from the uncertainty associated with N, which is an important conceptual gain. In contrast, successive historical sightings of the species are a sequence of random times occurring in the interval ½0; τ (S1 oS2 o S3 o …). We describe random points in time using a Poisson process with intensity function λðtÞ (Durrett, 2010). Different choices for the function λðtÞ allow for general instances of non-homogeneous sighting intensities, perhaps due to non-homogeneous sampling effort, with a constant intensity as a particular case. The cumulative intensity function, defined by Z t ΛðtÞ ¼ λðuÞdu; 0

will provide convenient notational simplification in what follows. 2.2. Analysis of last sightings To summarise, biological extinction is described via a density function gðtÞ—or the equivalent hðtÞ or GðtÞ—whereas human sighting effort is described by an intensity function λðtÞ. We suppose the existence of a time of last sighting, denoted by L. With our notation for random variables, L ¼ maxfSi : Si ≤Eg. In the Appendix, we show that the probability density function for L is given by Z ∞ ΛðxÞ−ΛðyÞ e f g;λ ðxÞ ¼ λðxÞ gðyÞdy: −ΛðyÞ x 1−e In this equation, functions gðtÞ and λðtÞ (or ΛðtÞ) act as parameters; the notation displaying g; λ as a subscript is to emphasise that f actually depends on these two functions of t. We are not implying that the functions gðtÞ and λðtÞ are known or easy to specify. They are theoretical constructs conceived to explain random occurrences of E and Si and indirectly to explain random occurrences of L. Extinction correlates, such as varying physical factors, reduced population densities and range contractions, may determine gðtÞ. Likewise, λðtÞ might be influenced by factors including sporadic wartime economies, imperial expansions, and scientific expeditions (Sarkar et al., 2008). 2.3. Statistical inference Because the main parameter of interest in extinction studies is gðtÞ, the statistical challenge is to infer gðtÞ (or hðtÞ) from L. In statistics, the parameter λðtÞ is termed a nuisance parameter

(Pawitan, 2001) in the sense that it plays a key role in determining the distribution of observed data. A most important consequence of having obtained a density function for the observed time of last sighting is that it enables statistical inference via likelihood methods (Pawitan, 2001). Suppose L1 ; L2 ; …; Ln are the n observed last sightings of an ensemble of species. For the i-th species, there is a pair of functions g i ðtÞ and λi ðtÞ that describe the probability density of Li , as explained previously. If such observations are statistically independent among species in the ensemble, one may write down the joint probability function of L1 ; L2 ; …; Ln , and thus a likelihood function, as Lðg 1 ; …; g n ; λ1 ; …; λn ; L1 ; L2 ; …; Ln Þ ¼ ∏ni¼ 1

f gi ;λi ðLi Þ ; F gi ;λi ðτÞ

or Lðg 1 ; …; g n ; λ1 ; …; λn Þ for short. Here, the previously derived density has been standardised by F gi ;λi ðτÞ because observations are restricted to the interval ½0; τ. This expression for L is very general because for n observations, 2n parameters participate, all of which are functions, leading to over-parameterisation. However, certain simplifying assumptions may be introduced. We recognise at least four general approaches that yield workable modelling approximations to the situation at hand: (a) structural assumptions, (b) parameterisation, (c) the inclusion of covariates, and (d) hierarchisation. We define structural assumptions as the general conditions implied in the biological context, regardless of the particular values of functions hi ðtÞ and λi ðtÞ. For example, the ensemble of species may be such that they may all be assumed to be observed under a common, characteristic sampling effort trend. This would mean that all functions λi ðtÞ may sometimes be assumed to be equal to a single λðtÞ. Similarly, perhaps the ensemble is such that the extinction rate is similar, at least for heavily exploited species characterised by narrow geographical ranges and low reproductive rates or species subject to increasing habitat loss. For parameterisation, we must specify functions with a small number of unknowns, e.g., polynomial or exponential. This specification means that either hi ðtÞ or λi ðtÞ or both are postulated to be of the form φi ðt; δÞ, where φi is a known function and δ is a short vector of unknown parameters. In the expression for the likelihood, the corresponding function parameter (h or λ) is replaced by the functions of δ parameters, converting the fully non-parametric situation into a parametric one. The inclusion of covariates means that one may be able to observe a variable XðtÞ that is conceptually related to hi ðtÞ or λi ðtÞ. For example, if XðtÞ is the total accumulated number of marine species described by time t, it may be reasonable to assume that this is related to λðtÞ, e.g., ΛðtÞ ¼ cXðtÞ (illustrated by the cases below). Classical regression models are an example of the incorporation of covariates parametrically. Combining the notions of parameterisation and covariates in general means to write either hi ðtÞ or λi ðtÞ as φi ðt; XðtÞ; δÞ. A hierarchical model is one in which some of the parameters that specify the likelihood may vary randomly. For example, assume that the i-th species in the ensemble has extinction density function gðt; ai Þ ¼ ai t ai −1 expð−t ai Þ for positive constants ai . Now assume that these positive constants have an exponential density given by γe−γa . The density for L is now Z ∞ ðe−ΛðxÞ at a−1 expð−xa Þ 0

þ eΛðxÞ λðxÞ

Z

∞ x

e−ΛðyÞ at a−1 expð−ya ÞdyÞγe−γa da;

M. Nakamura et al. / Journal of Theoretical Biology 333 (2013) 166–173

which ultimately depends on Λ and γ. The latter is commonly called a hyperparameter, as it has been invoked to model original parameters ai . As seen, hierarchical modelling may involve additional numerical issues (integration), as well as the need to discern what parameters are random and what density describes them (Royle and Dorazio, 2009). Assume that some sort of modelling reduction device has been adopted and that two generic finite-dimensional parameters are identified for describing all g's and λ0 s, say θ and β. The likelihood is then written as Lðθ; βÞ ¼ ∏ni¼ 1

f gi ðθÞ;λi ðβÞ ðLi Þ ; F gi ðθÞ;λi ðβÞ ðτÞ

where θ is the parameter of interest and β is a nuisance parameter. Let d be the dimension (number of elements) of θ. A likelihood approach is appealing because it provides estimation and testing procedures for θ in the presence of a nuisance parameter by means of the profile likelihood (Pawitan, 2001), defined by ^ LP ðθÞ ¼ Lðθ; βðθÞÞ; ^ where βðθÞ is the value of β that maximises Lðθ; βÞ for each fixed value of θ. Let θ^ and β^ denote the maximum likelihood estimates. An approximate ð1−αÞ  100% confidence region for θ can be obtained in terms of relative profile likelihood as ^ βÞ ^ 4 exp½−χ 2 ðαÞ=2g; fθjLP ðθÞ=Lðθ; d where χ 2d ðαÞ is the upper α quantile of the χ 2 distribution with d degrees of freedom (Pawitan, 2001). For α ¼ 0:05 and d ¼ 1, the cutpoint is exp½−3:84=2 ¼ 0:14; this is the horizontal line in panels (c) of Figs. 1–3, and the values of the parameter that exceed the cutpoint are the confidence interval.

169

2.4. Illustration We applied the statistical inference procedure to three available datasets on recent marine extinctions and extirpations. The first example is based on Dulvy et al. (2003), who proposed a list of 112 recent extirpations that have recently occurred in the marine realm. Of these, we only considered the 78 cases for which a last sighting date is available. The second is derived from del MonteLuna et al. (2007) and includes 28 confirmed extirpations. The third dataset consists of 21 recent global extinctions of marine species on which both authors agree. Computations and graphics were written in the R software environment (R Development Core Team, 2012). For all datasets time zero is 1500 AD (considered by MacPhee and Flemming, 1999) as the beginning of the modern era regarding extinctions and extirpations), and time τ is 2000 (more recent last sighting of a marine species). Based on the fact that the first extirpation data point occurs in 1640, we assumed that there were no last sightings, and therefore no extinctions, between 1500 and 1640. For simplicity, we will describe extinctions with a common hazard function. Our illustration will consider the Weibull hazard, described by two parameters and given by   θ1 t θ1 −1 hðt; θ1 ; θ2 Þ ¼ : θ2 θ2 If θ1 ¼ 1, the density is exponential, so that the hazard is constant; if θ1 4 1, it monotonically increases; and for θ1 o1, the hazard is decreasing. Thus, this family contains cases of biological relevance for investigating trends. For the accumulated intensity function, we postulate ΛðtÞ ¼ βt 2 , or λðtÞ ¼ 2βt. This specification follows from a consideration of data compiled by the World Register of Marine Species (Appeltans et al., 2012) consisting of the number of marine species described per year since 1758, the year of publication of

Relative hazard

Extinction rate 0.8

0.4

0.0 1500

1700 Year

1900

Relative profile likelihood Relative profile likelihood

Density of last sightings

0.012 0.006 0.000 1500

1700 Year

1900

0.8

0.4

0.0 5

6

7 8 Parameter θ1

9

Fig. 1. Model outputs for all marine extirpations (see References Dulvy et al. (2003, 2009)): (a) estimated instantaneous rate of (total) recent marine extirpations rescaled with respect to its value at time 2000; (b) the estimated density of last sightings with a histogram of observed last sightings superimposed; and (c) the profile likelihood for the parameter θ1 along with an approximate 95% confidence interval. Dotted lines indicate the fit corresponding to the maximum likelihood estimate. Recent marine extirpations run from 1500 onwards (MacPhee and Flemming, 1999).

170

M. Nakamura et al. / Journal of Theoretical Biology 333 (2013) 166–173

Relative hazard

Extinction rate 0.8

0.4

0.0 1500

1700 Year

1900

Density of last sightings Relative profile likelihood

Relative profile likelihood

0.012 0.006 0.000 1500

1700 Year

0.8

0.4

0.0

1900

3

4

5

6

7

8

9

Parameter θ1

Fig. 2. Model outputs for verified-only marine extirpations (see Reference del Monte-Luna et al. (2007)): (a) estimated instantaneous rate of recent (confirmed) marine extinctions rescaled with respect to its value at time 2000, (b) the estimated density of last sightings with a histogram of observed last sightings superimposed, and (c) the profile likelihood for the parameter θ1 along with an approximate 95% confidence interval. Dotted lines indicate the fit corresponding to the maximum likelihood estimate. Recent marine extirpations run from 1500 onwards (MacPhee and Flemming, 1999).

Relative hazard

Extinction rate 0.8

0.4

0.0 1500

1700 Year

1900

Density of last sightings Relative profile likelihood

0.004

Relative profile likelihood

0.002

0.000 1500

1700 Year

1900

0.8

0.4

0.0 1.0

1.5

2.0 2.5 3.0 Parameter θ1

3.5

Fig. 3. Model outputs for data on recent marine extinctions (see References Dulvy et al. (2003, 2009), del Monte-Luna et al. (2007)): (a) estimated instantaneous rate of recent marine extinctions rescaled with respect to its value at time 2000; (b) the estimated density of last sightings with a histogram of observed last sightings superimposed; and (c) the profile likelihood for the parameter θ1 along with an approximate 95% confidence interval. Dotted lines indicate the fit corresponding to the maximum likelihood estimate. Recent marine extinctions run from 1500 onwards (MacPhee and Flemming, 1999).

Linnaeus’s Systema Naturae 10th Edition, in which the binomial system was formally introduced for zoological names (Heppel, 1981). This variable follows an approximate linear trend in t. The assumption that the intensity of sampling effort is proportional

to the number of descriptions suggests that λðtÞ is proportional to t (or that ΛðtÞ is proportional to t 2 ). In the examples to follow, θ1 , the shape parameter of the Weibull hazard, is particularly relevant because it determines a trend over

M. Nakamura et al. / Journal of Theoretical Biology 333 (2013) 166–173

171

Table 1 Maximum likelihood estimates of the parameters of the density model for L (see Section 2.4). For 95% confidence intervals for the parameter θ1 , see panel (c) in Figs. 1–3. Data set

θ1 Estimate

θ2 Estimate

β Estimate

p-Value for H 0 : θ1 ¼ 1

Dulvy et al. (2003, 2009) (suspicious and verified extirpations) del Monte-Luna et al. (2007) (verified extirpations only) Global extinctions

6.96 5.69 2.04

2292.82 3346.14 1.000e5

0.42 0.42 1.47e-4

o 0.001 o 0.001 0.011

time. The likelihood function has two nuisance parameters, β and θ2 . For the three described cases, we obtained maximum likelihood estimates of the parameters (Table 1).

3. Results The results of the analysis are shown in Figs. 1, 2 and 3 for total extirpations, confirmed extirpations and global marine extinctions, respectively, all three containing identical panels. The dotted line in panel (a) is the maximum likelihood estimated extinction rate, rescaled with respect to its value at time 2000. This estimate reflects the notion of relative instantaneous extinction rate described previously. For example, a relative rate of 0.5 in the year 1500 means that the extinction rate at that time was 50% of what it is in the year 2000, regardless of the total number of extant populations/species. Notice that a constant rate of extinction over the whole observation period on the scale of this plot would be represented as a constant relative hazard at the value 1. Panel (b) shows the estimated density (dotted line) of last sightings together with a histogram (bars) of observed last sightings. If this estimated density and the histogram agree, then the proposed parametric model fits the observed data. Panel (c) is the profile likelihood for the main parameter of interest, with an approximate 95% confidence interval plotted as a horizontal line. If this interval does not contain the value θ1 ¼ 1, then the hypothesis of a constant extinction rate is rejected at a significance level of 0.05. The multiple grey curves plotted in panels (a) and (b) can be interpreted as plausible examples under inferential uncertainty. The dotted line is associated with the maximum likelihood estimate and corresponds to the centre of the confidence region in parametric space. The cluster of curves can be interpreted as containing the unknown function (relative extinction rate or density) with high confidence because of the way they were chosen to be included in the plot: random samples of parameter values selected from a 95% confidence region for ðθ1 ; θ2 ; βÞ were drawn, and the corresponding functions that they each specify were computed and plotted. A first evident feature in Figs. 1 and 2 (all extirpations and confirmed extirpations) is the expected effect of a larger sample size, the result of basing the analysis on sample sizes 78 and 28, respectively. As expected, the cloud of grey curves is more tightly grouped in Fig. 1. The empirical histograms are both well described by the fitted density. In these cases, there is convincing evidence that the extirpation rate is increasing in time. The difference resulting from the inclusion of all extirpations (nonconfirmed plus confirmed) in the analysis appears to be primarily quantitative. The confidence interval for the parameter is shifted slightly to the left in Fig. 2. This shift means that the estimated speed at which marine populations have been lost during the past 500 years does not change substantially whether confirmed or total extirpations are used in the analysis. 4. Discussion Roberts and Solow (2003) and Vogel et al. (2008) postulated a mathematical description for sighting records similar to the one

used in this study. Solow and Smith (2000) also consider the notion of a constant hazard waiting time (exponential) for extinction, assuming a constant Poisson rate and focusing on the complete record of sightings instead of the last ones. A constant hazard is also explicitly recognised in the Law of Constant Extinction addressed in Kitchell and Hoffman (1991), which is associated with an exponential distribution for extinction in evolutionary timeframes. In Fig. 3b, it can be appreciated that the analysis of global extinctions entails a different and relevant amount of inferential uncertainty. Because a histogram constructed with only 21 points is somewhat unreliable, a lack of agreement between observed and estimated last sightings should not be readily interpreted as a poor fit because the amount of uncertainty is substantial. Such uncertainty remains even after assuming a parametric form for the density. In contrast to the two previous examples (Figs. 1 and 2), the densities of last sightings contained in the confidence boundaries showed contrasting shapes (linear, exponential potential, and even negative trends). The large amount of statistical uncertainty can also be seen in the resulting relative extinction rate, which may either be increasing or constant (or even slightly decreasing in time; Fig. 3a). This variation is a consequence of the inclusion of the value 1 at the lower end of the confidence interval (a constant extinction rate). This result implies that given the limited amount of data, the hypothesis of a constant rate of modern marine extinctions cannot be rejected; alternatively, the existence of an increasing extinction rate is not statistically conclusive. The most obvious conclusion from our illustration is that the amount of inductive uncertainty regarding the parameters of the model and the meaningful functions they describe is substantial. We conclude that this uncertainty is a consequence of: (1) small sample sizes; (2) the difficulty of extracting information from a limited amount of observations, as is the case for recent marine extinctions; and (3) the challenging task of reasoning solely from last sightings. We have shown that for both Dulvy et al. (2003) and del Monte-Luna et al. (2007) datasets, the null hypothesis H 0 : θ1 ¼ 1 is rejected at the 0.05 level. This result means that the extirpation rate of marine populations has been increasing during the past 500 years but that the magnitude of this rate is uncertain. Indeed, we see that statistically plausible values of the hazard function vary from modest to extreme increments. For global marine extinctions, the evidence is not as strong. The value θ1 ¼ 1 lies virtually on the lower edge of the confidence interval, indicating that a constant extinction rate is plausible within statistical certainty. Thus, the trend in the rate at which marine species have disappeared over the past 500 years is idiosyncratic. A few caveats are in order. Our illustration has assumed a number of simplifications, such as an equal extinction rate for all species in the ensemble. In an evolutionary context, the extinction rate of certain marine taxa can be inferred only if the fossil record is sufficiently complete. If available information is more limited for all datasets, as in this case, then the extinction rates may also be referred to the marine biota as a whole. Analysing data by subgroups thought to have homogeneous extinction rates is an option, as is the evaluation of more refined

172

M. Nakamura et al. / Journal of Theoretical Biology 333 (2013) 166–173

modelling strategies. Nevertheless, the issue of extreme uncertainty will persist or even be magnified if de facto reductions are made in the number of observations in each group or if additional parameters are incorporated. Alternative choices for the hazard function and/or sampling intensity function will not reduce the magnitude of the uncertainty; in fact, if less simplistic formulations than those we have used as an illustration are devised (e.g., a polynomial curve of degree n 42), the problem can only become more severe because the number of parameters has been increased. The incorporation of supplementary variables that contain information about extinction is one potential approach to the reduction of uncertainty (as a function of the reproductive potential, body mass, litter size or other extinction correlates), but we may then ask what information is realistically available for all endangered or extinct species. Our general formulation allows researchers to represent the sampling effort and extinction processes in a way that they find convenient.

because if there has been a sighting, then necessarily L≤y. If y 4 x, then  ∞  P 0 ðL≤xE ¼ yÞ ¼ ∑ P 0 ðL≤xjM ¼ n; E ¼ yÞP 0 ðM ¼ njE ¼ eÞ ∞



¼ ∑

n¼1

¼

n¼1

ΛðxÞ ΛðyÞ

n

e−ΛðyÞ ½ΛðyÞn 1 1−e−ΛðyÞ n!

∞ ½ΛðxÞn 1 ðeΛðxÞ −1Þe−ΛðyÞ e−ΛðyÞ ¼ ∑ : 1−e−ΛðyÞ n ¼ 1 n! 1−e−ΛðyÞ

Combining (A. 1) and (A. 2), we find Z x Z ∞ e−ΛðyÞ P 0 ðL≤xÞ ¼ gðyÞ dy þ ðeΛðxÞ −1Þ gðyÞ dy 1−e−ΛðyÞ −∞ x Z ∞ −ΛðyÞ e ¼ GðxÞ þ ðeΛðxÞ −1Þ gðyÞ dy −ΛðyÞ x 1−e Z ∞ ΛðxÞ−ΛðyÞ Z ∞ −ΛðyÞ e e ¼ GðxÞ þ gðyÞ dy− gðyÞ dy −ΛðyÞ −ΛðyÞ x 1−e x 1−e

ðA:2Þ

ðA:3Þ

(this last form is more convenient for numerical computation). A density function for L is now found by differentiation: Contributions Conceived the idea: MN, PdML, SLC, DLB. Data compilation: PdML, DLB. Data analysis: MN, PdML. Interpretation: MN, PdML, DLB, SLC. Manuscript writing: MN, PdML, SLC, DLB.

Funding CONACyT 101445, COFAA and EDI fellowships of the IPN, and 20130624 SIP Project.

d P 0 ðL≤xÞ dx

  Z ∞ −ΛðyÞ e−ΛðxÞ e ΛðxÞ λðxÞ gðyÞ dy ¼ gðxÞ þ ðeΛðxÞ −1Þ − gðxÞ þ e −ΛðyÞ 1−e−ΛðxÞ 1−e x Z ∞ −ΛðyÞ e ¼ gðxÞ−gðxÞ þ eΛðxÞ λðxÞ gðyÞ dy −ΛðyÞ x 1−e Z ∞ ΛðxÞ−ΛðyÞ e ¼ λðxÞ gðyÞ dy: ðA:4Þ −ΛðyÞ 1−e x

References Acknowledgements We thank COFAA, EDI and SIP of the IPN (20130624), as well as CONACyT Project 101445. We also thank two anonymous referees for having led to substantial improvements and corrections, including important technical issues.

Appendix Proof of last sighting distribution The argument is to compute the distribution function of L by conditioning on E ¼ y. Fix y4 0, and let M y be the (random) number of sightings in the interval ð0; yÞ: Two important properties of the Poisson process will be used in what follows: (i) the distribution of M y is Poisson with parameter ΛðyÞ; and (ii) for n≥1, conditionally on M y ¼ n, maxðS1 ; S2 ; …; Sn Þ has a distribution identical to that of maxfZ 1 ; Z 2 ; …; Z n g, where Z 1 ; …; Z n are n independent and identically distributed random variables, each with distribution function ΛðtÞ=ΛðyÞ for 0≤t≤y. That is, the distribution function of L given E ¼ y and M y ¼ n is ½ΛðtÞ=ΛðyÞn . We are interested in computing the distribution of a last sighting, which implicitly means that at least one sighting has occurred. The corresponding conditional (given at least one sighting) probability measure will be denoted by P 0 ; to distinguish it from the underlying probability P. Since PðM y ¼ 0jE ¼ yÞ ¼ e−ΛðyÞ ; the conditional density of M y given E ¼ y is given by P 0 ðM y ¼ njE ¼ yÞ ¼ e−ΛðyÞ ½ΛðyÞn =n!ð1=1−e−ΛðyÞ Þ; for n≥1: Now let x 40, and consider two cases: y≤x and x oy: If y≤x, then P 0 ðL≤xjE ¼ yÞ ¼ 1;

ðA:1Þ

Appeltans W., P. Bouchet, G.A. Boxshall, C. De Broyer, N.J. de Voogd, D.P. Gordon, B. W. Hoeksema, T. Horton, M. Kennedy, J. Mees, G.C.B. Poore, G. Read, S. Stöhr, T.C. Walter, M.J. Costello (Eds.), World Register of Marine Species, 2012. Available at 〈http://www.marinespecies.org〉 (accessed 27.03.12). Barnosky, A.D., Matzke, N., Tomiya, S., Wongan, G.O.U., Quental, B., Marshall, T.B.C., McGuire, J.L., Lindsey, E., Maguire, K.C., Mersey, B., Ferrer, E.A., 2011. Has the Earth’s sixth mass extinction already arrived? Nature 471, 51–57. Bouchet, P., 2006. The magnitude of marine biodiversity. In: Duarte, C.M. (Ed.), The Exploration of Marine Biodiversity, Scientific and Technological Challenges. Foundation BBVA, pp. 32–64. del Monte-Luna, P., Lluch-Belda, D., Serviere-Zaragoza, E., Carmona, R., ReyesBonilla, H., Aurioles-Gamboa, D., Castro-Aguirre, J.L., Guzmán del Próo, S., Trujillo-Millán, O., Brook, B.W., 2007. Marine extinctions revisited. Fish Fish. 8, 107–122. Dulvy, N.K., Sadovy, Y., Reynolds, J.D., 2003. Extinction vulnerability in marine populations. Fish Fish. 4, 25–64. Dulvy, N.K., Pinnegar, J.K., Reynolds, J.D., 2009. Holocene extinctions in the sea. In: Turvey, S.T. (Ed.), Holocene Extinctions. Oxford University Press, Oxford, pp. 129–150. Durrett, R., 2010. Essentials of Stochastic Processes. Springer-Verlag, New York. Futuyma, D., 2009. Evolution. Sinauer Associates, Inc., Sunderland, MA. Heppel, D., 1981. The Evolution of the code of zoological nomenclature. In: Wheeler, A., Price, J.H. (Eds.), History in the Service of Systematics. Society for the Bibliography of Natural History, London, pp. 135–141. (SBNH Special Publication no. 1). Jablonski, D., 2004. Extinction: past and present. Nature 427, 589. Kitchell, J.A., Hoffman, A., 1991. Rates of species-level origination and extinction: functions of age, diversity, and history. Acta Paleontol. Pol. 36, 39–67. Lawless, J.F., 2003. Statistical Models and Methods for Lifetime Data. Wiley, Hoboken, NJ. MacPhee, R.D.E., Flemming, C., 1999. Requiem Aeternam: the last five hundred years of mammalian species extinctions. In: MacPhee, R.D.E. (Ed.), Extinctions in Near Time: Causes, Contexts, and Consequences. Kluwer Academic/Plenum Publishers, New York, pp. 333–372. May, R.M., Lawton, J.H., Stork, N.E., 1995. Assessing extinction rates. In: Lawton, J.H., May, R.M. (Eds.), Extinction Rates. Oxford University Press, New York, pp. 1–24. Mayr, E., 1942. Systematics and the Origin of Species from the Viewpoint of a Zoologist. Harvard University Press, Cambridge, MA. Mayr, E., 1970. Populations, Species, and Evolution: An Abridgment of Animal Species and Evolution. Harvard University Press, Cambridge, MA. Mayr, E., 2001. What Evolution Is. Basic Books, New York. Millennium Ecosystem Assessment, 2005. Ecosystems and Human Well-Being: Biodiversity Synthesis. World Resources Institute. Washington, D.C.

M. Nakamura et al. / Journal of Theoretical Biology 333 (2013) 166–173

Mora, C., Tittensor, D.P., Adl, S., Simpson, A.G.B., Worm, B., 2011. How many species are there on Earth and in the ocean? PLoS Biol. 9, http://dx.doi.org/10.1371/ journal.pbio.1001127. Myers, R.A., Worm, B., 2003. Rapid worldwide depletion of predatory fish communities. Nature 423, 280–283. Pawitan, Y., 2001. In All Likelihood: Statistical Modelling and Inference Using Likelihood. Oxford University Press, Oxford. Pimm, S.L., Brooks, T.M., 2000. The Sixth extinction: how large, how soon, and where?. In: Raven, P. (Ed.), Nature and Human Society: The Quest for a Sustainable World. National Academy Press, Washington, pp. 46–62. Pimm, S.L., Jenkins, C.N., 2010. Extinction and the practice of preventing them. In: Sodhi, N.S., Ehrlich, P.R. (Eds.), Conservation Biology for All. Oxford University Press, New York, USA, pp. 181–196. Primm, S.L., Rusell, G.J., Gittleman, J.L., Brooks, T.M., 1995. The future of biodiversity. Science 269, 347–350. R Development Core Team, 2012. R: A language and Environment for Statistical Computing. R Foundation for Statistical Computing. Vienna, Austria. 〈http:// www.R-project.org/〉. Regan, H.M., Lupia, R., Drinnan, A.N., Burgman, M.A., 2001. The currency and tempo of extinction. Am. Nat. 157, 1–10. Rivadeneira, M.M., Hunt, G., Roy, K., 2009. The use of sighting records to infer species extinctions and evaluation of different methods. Ecology 90, 1291–1300. Roberts, D.L., Solow, A.R., 2003. Flightless birds: When did the dodo become extinct? Nature 426, 245 245. Roberts, D.L., Elphick, C.S., Reed, M., 2010. Identifying anomalous reports of putatively extinct species and why it matters. Conserv. Biol. 24, 189–196.

173

Rogers, A.D., D.d´A. Laffoley, 2011. International Earth System Expert Workshop on Ocean Stresses and Impacts. Summary Report. IPSO, Oxford. Royle, A., Dorazio, R.M., 2009. Hierarchical Modeling and Inference in Ecology: The Analysis of Data from Populations, Metapopulations and Communities. Academic Press, London. Sarkar, N., Schenk, R., Norton, C.N., 2008. Exploring historical trends using taxonomic name metadata. BMC Evol. Biol. 8, 144, http://dx.doi.org/10.1186/ 1471-2148-8-144. Smith, W.K., Solow, A.R., 2012. Missing and presumed lost: extinction in the ocean and its inference. ICES J. Mar. Sci. 69, 89–94. Solow, A.R., Smith, W.K., 2000. Testing for a mass extinction without selecting taxa. Paleobiology 26, 647–650. Sprott, D.A., 2000. Statistical Inference in Science. Springer-Verlag Science, New York. Turvey, S.T., Pitman, R.L., Taylor, B.L., Barlow, J., Akamatsu, T., Barrett, L.A., Xiujiang, Zhao, Reeves, R.R., Stewart, B.S., Pusser, L.T., Kexiong, Wang, Zhuo, Wei, Xianfeng, Zhang, Richlen, M., Brandon, J.R., Ding, Wang, 2007. First humancaused extinction of a cetacean species? Biol. Lett. 3, 537–540. Valentine, J.W., 1973. Evolutionary Paleoecology of the Marine Biosphere. PrenticeHall, Englewood Cliffs, NJ. Vogel, R.M., Hosking, J.R.M., Elphick, C.S., Roberts, D.L., Reed, J.M., 2008. Bull. Math. Biol. 71, 701–719, http://dx.doi.org/10.1007/s11538-008-9377-3. Wignall, P., Benton, M., 1999. Lazarus taxa and fossil abundance at times of biotic crisis. J. Geol. Soc. London 156, 453–456. Worm, B., Barbier, E.B., Beaumont, N., Duffy, J.E., Folke, C., Halpern, B.S., Jackson, J.B. C., Lotze, H.K., Michell, F., Palumbi, S.R., Sala, E., Selkoe, K.A., Stachowicz, J.J., Watson, G., 2006. Impacts of biodiversity loss on ocean ecosystem services. Science 314, 787–790.