Accepted Manuscript Bayesian approach to OSL dating of poorly bleached sediment samples: Mixture 2 Distribution Models for Dose (MD ) Claire Christophe, Anne Philippe, Guillaume Guérin, Norbert Mercier, Pierre Guibert PII:
S1350-4487(16)30354-7
DOI:
10.1016/j.radmeas.2017.10.007
Reference:
RM 5849
To appear in:
Radiation Measurements
Received Date: 4 November 2016 Revised Date:
3 June 2017
Accepted Date: 11 October 2017
Please cite this article as: Christophe, C., Philippe, A., Guérin, G., Mercier, N., Guibert, P., Bayesian approach to OSL dating of poorly bleached sediment samples: Mixture Distribution Models for Dose 2 (MD ), Radiation Measurements (2017), doi: 10.1016/j.radmeas.2017.10.007. This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
ACCEPTED MANUSCRIPT
Claire Christophe
RI PT
Bayesian approach to OSL dating of poorly bleached sediment samples: Mixture Distribution Models for Dose (MD2) ab
, Anne Philippeb , Guillaume Guérina , Norbert Merciera and Pierre Guiberta
SC
a
M AN U
IRAMAT-CRP2A, Maison de l’archéologie, Université de Bordeaux Montaigne, Esplanade des Antilles 33607 Pessac Cedex, France. b Laboratoire de Mathématique Jean Leray, Université de Nantes, 2 Chemin de la Houssinière, 44322 Nantes, France.
October 16, 2017 Abstract
AC C
EP
TE
D
Optically Stimulated Luminescence (OSL) is commonly used to date the last exposure of grains extracted from sediments to sunlight. However, it is frequent that some of the measured grains were not sufficiently exposed to light before burial. Such samples are said to be poorly bleached. We propose a new statistical model based on a Bayesian approach to analyse OSL measurements performed on poorly bleached sediment samples. For such data, we propose a mixture model of Gaussian distributions to analyse equivalent doses (De ) distributions. This model can either be applied directly to the observed De values, or after a log-transformation. Bayesian analysis requires numerical approximation, to do this we use the JAGS (Just Another Gibbs Sampler) programme to run models using Markov Chain Monte Carlo simulations. We apply the model to synthetic datasets and real samples. Keywords: Optically Stimulated Luminescence dating, poorly bleached samples, Bayesian analysis, mixture distribution models
1
Introduction
Optically Stimulated Luminescence (0SL), first introduced as "Optical Dating" by (Huntley et al., 1985), is widely used to date the last exposure of grains extracted from sediments to sunlight. This method is based on the property of quartz and feldspar grains to store electrons in theirs crystalline defects, under the effect of ionizing radiation (e.g., ambient radioactivity). The zeroing mechanism is exposure to sunlight, that ejects trapped electrons. The sediment age is obtained by dividing the dose absorbed by the target grains, or palaeodose, by the dose rate they have been exposed to since the last exposure
1
ACCEPTED MANUSCRIPT
AC C
EP
TE
D
M AN U
SC
RI PT
to light. In practice, the palaeodose is usually estimated by equivalent doses using the SAR protocol (Murray and Wintle, 2000). This protocol associates, for each measured aliquot, the natural luminescence signal with a dose value through a projection on the dose response curve of the aliquot. However, it is quite common that grains in the sample were not sufficiently exposed to light to fully reset their OSL signal (or were incorporated by postdepositional effects Roberts et al., 1998, 2000). As a result, these grains indicate an equivalent dose (denoted De ) greater than the dose which is characteristic of the sample age, i.e. the equivalent dose includes a remnant dose corresponding to the luminescence signal that was not reset at the time of sediment deposition. Such samples are said to be poorly bleached. In some cases, it is possible to determine whether or not a sample is completely bleached using the comparison between quartz OSL and feldspar IRSL signals (Murray et al., 2012). Mixture distribution models are well suited to describe non-homogeneous populations. This is the case of poorly bleached samples, which include both incompletely and fully bleached grains. The challenge is then to select well bleached grains from the population of grains, in order to compute the sample palaeodose using only the De values of completely bleached grains. Mixture distributions are well documented, for an overview and applications we refer to (McLachlan and Peel, 2000; Zhang and Yangxin, 2015) and to (Marin et al., 2005) for a Bayesian point of view. In the field of OSL dating, (Thomsen et al., 2007) proposed a model (the IEU) to select the well-bleached grains based on an expected value of dispersion in De values; all grains showing higher De values than this cut-off are rejected from the analysis. Using a similar input parameter for the expected dispersion in the well-bleached population of grains, (Galbraith et al., 1999) proposed a statistical model for poorly bleached samples based on a mixture of distributions: the Minimum Dose Model (MDM, called Minimum Age Model in the original publication - see (Guérin et al., 2017) for the change in terminology). More precisely, the MDM describes the log-transformed De estimations by a mixture of a Gaussian distributions for well bleached grains and a truncated Gaussian distribution for poorly bleached grains. More details on the MDM and mixture of distributions are provided in Section 2. However, the MDM is developed in a frequentist setting, while Bayesian approaches are well suited to deal with archaeological data (Buck et al., 1996), since they make inference on the basis of both data and a priori expert knowledge (for instance stratigraphic constraints). The reader is referred to (Buck et al., 1991) for some applications in radiocarbon dating, to (Millard, 2006b) in thermo-luminescence dating and to (Millard, 2006a) in electron spin resonance. In OSL dating, (Rhodes et al., 2003) first implemented a Bayesian model using the OxCal programme, which does not take into account the specificities of luminescence dating; later, OSL dedicated Bayesian models have been proposed for completely bleached samples. (Sivia et al., 2004; Zink, 2013) contributed to modeling the palaeodose from the observed De values. This requires preliminary steps to estimate indivdual De values, using for instance the Analyst software, see (Duller, 2015). (Huntriss, 2008) and (Combès et al., 2015) proposed a Bayesian model to estimate the palaeodose from the luminescence measurements of each grain of the sample. However, in (Huntriss, 2008) some
2
ACCEPTED MANUSCRIPT
AC C
EP
TE
D
M AN U
SC
RI PT
choices appear to be impractical for applications to Quaternary sediments, while the model of (Combès et al., 2015; Guérin et al., 2015a) may be applied to all types of well bleached samples. Recently, (Combès and Philippe, 2017) proposed the computation of ages in stratigraphic constraints, taking into account systematic and random sources of errors. (Cunningham et al., 2014) proposed a Bayesian age model for OSL measurements of poorly bleached fluvial sediment, describing the remnant dose of an aliquot by a truncated Gaussian distribution. In our view, it seems difficult to justify why a truncated Gaussian distribution would accurately describe the equivalent doses of poorly-bleached grains; to our knowledge, such a justification has not been demonstrated in the literature, neither for a truncated Gaussian distribution nor for a truncated lo-gnormal distribution (as is assumed in the MDM). A series of simulations using simple assumptions regarding (i) the initial state of the grains, (ii) an exponential decay of luminescence as a function of exposure time and (iii) random light exposure processes, have shown that the expected distribution of remnant, apparent doses is described by an exponential law (Guibert et al., a). Finally, like the MDM, the model proposed by (Cunningham et al., 2014) requires the input of a characteristic, expected value of overdispersion for well-bleached samples, which is very problematic (Guérin et al., 2015c). As a result, we propose here to address the problem with a different approach. Let us recall that the difference between completely and poorly bleached samples appears at the step of palaeodose estimation from the equivalent dose of the grains. In the following, by poor bleaching, we refer to any process leading to a fraction of the grains of a sample for which the equivalent dose partly arises from the presence of remnant electrons trapped in the crystal at the time of sediment burial; such a definition includes insufficient light exposure during sediment transport, in situ roof spall in a cavity or chemical alteration of rocks resulting in the release of grains unexposed to light in the sediment, etc. For such poorly bleached samples, when measuring OSL at the single grain scale (Duller et al., 1999), some grains must be isolated and should not enter in the estimate of the palaeodose. However, the equivalent dose determination from the observation of luminescence signals emitted by a grain does not differ from a well bleached grain to a poorly bleached grain. In this article, we propose to expand the model developed in (Combès et al., 2015) to poorly bleached samples. We propose to amend the equivalent dose model defined in (Combès et al., 2015, Section 3.2.3). Instead of a Cauchy distribution (see Eq. (7) in Combès et al., 2015), we propose a Gaussian mixture model: the first component describes the sub-population of well bleached grains, and the others fit the population of poorly bleached grains. Since each grain has a potentially complex burial history, we cannot ensure that their equivalent dose distribution belongs to a standard parametric family. As a result, we propose to describe the De sub-population of poorly-bleached grains by a mixture of Gaussian distributions, in the spirit of non parametric approaches. Usually, especially for old samples, equivalent doses are described by log-normal distributions. Hence, we propose also the Gaussian model for the log-transformation of the equivalent doses, and we recommend to estimate the palaeodose by the mean of the log-normal distribution of the well bleached grains, as it is suggested in (Guérin et al., 2017). We also propose criteria to
3
ACCEPTED MANUSCRIPT
RI PT
choose the best model to describe the data. The outline of this paper is as following: in Section 2, we give the definition of a mixture of distributions and we present our Mixture Distribution Models for Doses (MD2 ). In Section 3, an implementation of the MD2 is proposed, as well as a selection method to find the model that is best suited for a given sample. In Section 4, the MD2 is applied to both simulated and real data.
Notation. We recall here the notations and probability density distributions used in the present article.
• B(p): Bernoulli distribution. Let X ∼ B(p) for p ∈ [0, 1], then P(X = 1) = 1 − P(X = 0) = p
SC
• Dir(α1 , ..., ακ ): Dirichlet distribution for α1 , ..., ακ > 0. The corresponding P probability density is defined as follows: (x1 , ..., xκ ) ∈ [0, 1] and κi=1 xi = 1, f (x1 , ..., xκ ) ∝ Πκi=1 xαi i −1 , where ∝ means an equality up to a multiplicative absolute constant.
M AN U
• N (µ, σ): Gaussian distribution for µ ∈ R and σ ∈ R∗+ . The corresponding probability density is defined as follows: x ∈ R, n(x|µ, σ) = (x−µ)2 1 √ exp − 2σ2 . 2 2πσ
• InvΓ(α, β): inverse-gamma distribution, for α, β > 0. The corresponding probability density is defined as follow: x ∈ R∗+ , f (x) ∝ x−α−1 exp −β x .
TE
D
2 Mixture models for poorly bleached samples
2.1
EP
To describe a population divided in sub-populations, a useful statistical model is a mixture of distributions. We provide a definition in the next section, and we discuss the MDM proposed by (Galbraith et al., 1999). Then, we present our model: MD2 .
Mixture of distributions and notations
AC C
Mixture of distributions. Such models can describe different features of data. They allow better descriptions of non-homogeneous populations and complex systems, and they are already applied in various domains: in astronomy, ecology, economics, biostatistics and in thermoluminescence (Zink, 2015, proposed a mixture model to reject outliers from a TL plateau test). Density function of a mixture of κ distributions is defined by ∀x ∈ R,
κ X
pk f (x|k),
(1)
k=1
P where (p1 , ..., pk ) ∈ [0, 1]κ satisfies κk=1 pk = 1; for all k ∈ {1, ..., κ}, f (.|k) is a density probability function defined on R. More precisely, for all k, pk is the proportion of the k −th sub-population of the mixture and f (.|k) characterizes this sub-population.
4
ACCEPTED MANUSCRIPT
D TE
density
0.000
0.002
0.004
0.006
M AN U
0.008
SC
0.010
RI PT
Let us illustrate such a model with an example of distribution of equivalent doses from a poorly bleached sample, FER2, from a slope deposit in the La Ferrassie rock-shelter (Dordogne, France). For details about this sample, the reader is referred to (Guérin et al., 2015b). In Figure 1, the observed De values are represented as a histogram. The black curve is the fitted mixture of four Gaussian distributions. Its first mode corresponds to the mode of the Gaussian density function with the smallest mean, which can be used to estimate the palaeodose based on presumably well bleached grains. The rest of the equivalent doses of incompletely bleached grains are fitted by the mixture of the other three Gaussian distributions. This sample will be detailed in Section 4.2.
EP
100
200
300
400
equivalent doses (Gy)
AC C
Figure 1: Histrogram of equivalent doses (N = 91 grains) for sample FER2. The black curve is a fitted mixture of four Gaussian distributions.
Notations. Let n be the sample size. Let (Dj )j∈{1,...,n} be the dataset, Dj denote the equivalent dose of the grain j, for all j ∈ {1, ..., n}. These doses are not directly measured, but we use the model defined in (Combès et al., 2015), or the Analyst software, to estimate the equivalent doses from the measured luminescence signals. In practice, De values are noisy as a result of estimation errors that result from the experimental uncertainties included in the estimation model of De . In (Combès et al., 2015), the posterior distribution for Dj contains theses errors, whereas in Analyst they are summarized by a standard deviation. Hereafter, we do not take into account the estimation error on the equivalent dose values to focus on the modeling of non-homogeneous populations of De .
5
ACCEPTED MANUSCRIPT
Let (D, σD ) stand for the sample palaeodose and the dispersion of equivalent doses. In the case of well bleached samples, the link between equivalent dose and palaeodose is defined by the following equation: Dj = D + σD εj ,
(2)
f (D|p, D, σD ) = pf1 (x|D, σD ) + (1 − p)f2 (x),
M AN U
∀D,
SC
RI PT
where εj is a random variable such that E(εj ) = 0 and var(εj ) = 1. However, in the case of poorly bleached samples, Eq. (2) does not hold for all equivalent doses, but only for equivalent doses of completely bleached grains. Since we do not know which grain is well bleached and which is not, a mixture of two distributions seems well-suited: one for describing well bleached grains and the other for poorly bleached grains (note: these two distributions may be very different, as detailed below; indeed, the poorly-bleached grains may not be well-described by a unique parametric family and may thus require a non-parametric approach for an accurate description). Using the previous notations and Eq. (1), let us define the density function of equivalent doses :
where p is the proportion of well bleached grains in the sample; f1 is a parametric density describing the equivalent doses of well bleached grains, with mean D and standard deviation σD ; and f2 is a density function describing the equivalent doses of poorly bleached grains. Parameters (p, D, σD ) are unknown, and the density functions f1 , f2 must be defined according to a model.
D
Discussion of the Minimum Dose Model. This model, proposed in
TE
(Galbraith et al., 1999), applies to the log-transformation of equivalent doses denoted (di )i∈{1,...,n} , and their density function is defined by f (di |γ, p, σ, µ) =
pn(di |γ, si ) | {z }
EP
well bleached density
+ (1 − p)n(di |µ, σ)1di >γ , {z } | poorly bleached density
AC C
where n(.|γ, si ) is a Gaussian density with mean γ and observed variance s2i (s2i is the estimated error associated to di ), and n(.|µ, σ)1.>γ is a Gaussian density with mean µ and variance σ 2 truncated at γ. Parameters p, γ, µ, σ are unknown, and a numerical method is proposed to approximate the maximum likelihood estimators. The palaeodose of the sample is estimated by exp(ˆ γ ), where γˆ is the maximum likelihood estimator of γ, which corresponds to the median of the log-normal distribution associated to the well bleached grains. A first remark is that the MDM does not include an estimation of the overdispersion of the well bleached population in this model. More precisely, this model requires as an input parameter, an estimate obtained "[...] from a well bleached sample of the same mineral derived from the same source and, ideally, of equivalent age to take into account time-dependent factors [...]" (Galbraith and Roberts, 2012, p17). For many samples, deriving such an estimate is a difficult task (even although in theory, for a suite of samples of similar age from a single location, with similar grain size distributions and radioisotopic concentrations, such an estimate may be guessed from well-bleached samples if there are demonstrably any). Secondly, in this model there are four unknown parameters and it is not obvious that the model is consistent for the studied
6
ACCEPTED MANUSCRIPT
SC
RI PT
sample, or that the model adequately fits the data. To overcome the consistency problem, (Galbraith and Roberts, 2012) proposed a three parameter model in which µ = γ. But this does not guarantee that the three-parameter model fits the data. In our MD2 model, instead of truncated distributions, we propose a non parametric density function to describe the poorly bleached population. Such densities do not require input parameters. For this reason, we think they may be more relevant to characterize the sub-population of poorly bleached grains, as these grains may have different histories of light exposure. In addition, we propose a mixture of distributions in a Bayesian setting, contrary to the MDM that was developed in a frequentist setting. Now, the question is how to estimate a non-parametric density function.
Estimation of non-parametric probability density function. In-
M AN U
finite mixture models have become the most popular method for density estimation in the Bayesian nonparametric literature (see Ghosh and Ramamoorthi, 2003, for an overview on Bayesian non parametric models). The most widely used is the infinite mixture of Gaussian distribution, where the unknown density f2 is written in the form Z f2 (y) = n(y|µ, σ)dG(µ, σ),
EP
TE
D
where, we recall that n(·|µ, σ) is the probability density function of a Gaussain distribution with mean µ and variance σ 2 , and G is an infinite discrete distribution (i.e. a discrete probability defined on a set with infinite cardinal, such as N∗ ). The distribution G can be written as an infinite sum of Dirac measures P∞ (denoted δ· ) : i=1 wi δµi ,σi where (µi , σi ) are independent and identically distributed random variables that are independent of the weights (ωi )i∈N . We can simplify the model by assuming that the component variances σi2 share a common value σ 2 , (see Ferguson, 1973; Ishwaran and James, 2002). Z ˜ f2 (y|σ) = n(y|µ, σ)dG(µ),
AC C
˜ = P∞ wi δµ . We adopt the simplest model, motivated by the fact where G i i=1 that we generally observe a small sample of De . Moreover, for practical reasons the number of components is often truncated to a maximum value, which is selected using a classical selection model criterion (see Section 3.2). It should be noted that identifiability of the model is ensured when prior distributions of mean parameters (µi )i∈N are assumed to be increasingly arranged or defined such that µ1 ∈ R+ and µi ∈ [µ1 , ∞[ for all i > 1.
2.2
Mixture Distribution Model for Dose: MD2
We introduce a mixture distribution model for two parametric families. First, we propose a Gaussian mixture model, which leads to relatively easy numerical convergence and is well suited for young samples, especially when some De estimates are negative. Secondly, we present a log-normal mixture model that is only valid when all De estimates are positive. The latter model is more in line with the common observation that the De observations are log-normally distributed (see Galbraith et al., 1999; Galbraith and Roberts, 2012).
7
ACCEPTED MANUSCRIPT
2.2.1
Gaussian mixture model
Let κ ∈ N∗ be fixed, here we consider a mixture of κ Gaussian distributions to describe the De distribution. Eq. (1) becomes: for all D ∈ R f (D|D, σD , p, q2 , ..., qκ , µ2 , ..., µκ , σ) = pn(D|D, σD ) + (1 − p)
κ X
qk n(D|µk , σ),
RI PT
k=2
Pκ
SC
where, for all k ∈ {2, ..., κ}, qk ∈ [0, 1] such that k=2 qk = 1, µk ∈ R and σ ∈ R∗+ . In this equation, n(.|α, β) is a Gaussian density function with mean α and standard deviation β. For the sake of simplicity, let us rewrite the previous mixture density: for all D ∈ R f (D|D, σD , p1 , ..., pκ , µ2 , ..., µκ , σ) = p1 n(D|D, σD ) +
κ X
pk n(D|µk , σ),
(3)
k=2
M AN U
where p1 = p and for all k ∈ {2, ..., κ}, pk = (1 − p)qk . For a fixed κ, ∈ N∗ , the parameter set Θκ = (D, σD , p1 , ..., pκ , µ2 , ..., µκ , σ) is unknown. To study mixture models, it is usual to introduce an unobserved variable Zj ∈ {1, ..., κ} (also called missing value or hidden random variable), which states from which Gaussian distribution Dj is an observation, for all j ∈ {1, ..., n}. More precisely, for all i, j ∈ {1, ..., n}, if i 6= j, Zi is independent of Zj and ∀k ∈ {1, ..., κ},
(4)
D
P(Zj = k) = pk .
TE
Using this unknown variable, instead of Eq. (3), we can define Dj as Dj |Zj , D, µ2 , ..., µκ , σD , σ ∼ N µZi , σZ2 j ,
EP
where µ1 = D, σ1 = σD and σk = σ for all k ∈ {2, ..., κ}. The Directed Acyclic Graph (DAG) in Figure 2 presents the structure of the Gaussian version of the MD2 .
AC C
Bayesian statistics. These statistics are based on Bayes formula, which states that data can adapt a prior information on the set of parameters Θκ to compute a posterior distribution of Θκ (see for instance Robert, 2001). An MCMC (Markov Chain Monte Carlo, for more informations see Gill, 2014) method, detailed in Section 3.1, is proposed to compute the posterior distributions of the parameters. Let us define the prior distributions for the parameter set Θκ .
Prior distribution choices. This step should by treated carefully, since different prior distributions can give very different posterior distributions. These priors should be based on prior knowledge of parameters, but one should not use the data in the determination of prior distributions. For the sake of identifiability of the model, we sort the means of the Gaussian distributions by increasing values. The parameter D is the smallest mean of the Gaussian distributions as it corresponds to well bleached grains, D < µ2 8
ACCEPTED MANUSCRIPT
σ
p1 ... pκ
D µ2
σD Zj
µκ Dj j ∈ {1, ..., n}
RI PT
...
M AN U
SC
Figure 2: Directed Acyclic Graph (DAG) of the Gaussian mixture model. The data (Dj )j∈{1,..,n} are estimated De values, parameters D and σD are common for all well bleached grains, and µ2 , ..., µκ and σ describe the distribution of De of poorly bleached grains. Variables (Zj )j are unobserved. and for all k ∈ {3, ..., κ}, it is assumed that µk−1 < µk . In addition, (p1 , ..., pκ ) are Pκ weights of Gaussian distributions, they are related through the equation k=1 pk = 1. Apart from these relationships, independence is assumed between the parameters of Θκ . Thus we can write the following prior distributions (see Figure 2): P(Θκ ) = P(D)P(σD )P(µ2 |D)Πκk=3 P(µk |µk−1 )P(σ)P(p1 , p2 , ..., pκ ).
EP
And,
TE
D
• Prior distributions for D, µ2 , ..., µκ : these parameters are means of Gaussian distributions comprising the mixture. In most OSL studies, the sample palaeodose (in Gy) belongs to [0,400], thus we propose D ∼ N 200, 2002 1]0,∞[ .
1]D,+∞[ , ∀k ∈ {3, ..., κ}, µk |µk−1 ∼ N 200, 2002 1]µk−1 ,+∞[ .
AC C
µ2 |D ∼ N 200, 2002
These prior distributions correspond to a nearly non informative prior for D, as the probability of the event D ∈ [0, 400] is at least equal to 0.68. In addition, the Gaussian distribution is truncated to be positive, since D is a palaeodose. For the other means, we propose a truncated Gaussian distribution to keep the order, while keeping a large variance to be little informative. It should be noted here that, if for a sample we can expect a lot of equivalent dose values exceeding 500 Gy, for archeological or/and geographical reasons (for example in the case of very old samples for which the dose response curves saturate at very high doses) then if Dmax is the highest regenerative dose in the SAR dose protocol, a solution could be to modify the mean and the standard deviation of the prior distribution of D and µk , replacing 200 by Dmax to 2 preserve the non informative choice of prior.
9
ACCEPTED MANUSCRIPT
where InvΓ is the inverse-gamma distribution.
RI PT
• Prior distributions for σD and σ: these parameters are standard deviations of Gaussian distributions, so we approximate the non informative Jeffreys prior for variance (c.f. Marin et al., 2005) using the following distribution: 1 1 2 2 , and σ ∼ InvΓ 0.1, , σD ∼ InvΓ 0.1, 1002 1002
2.2.2
Log-normal mixture model
SC
• Prior distribution for (p1 , ..., pκ ): the non-informative Jeffreys prior for the weight (p1 , ..., pκ ) is the Dirichlet distribution (see Marin et al., 2005, for details) with the following parameters: 1 1 . (p1 , ..., pκ ) ∼ Dir , ..., κ κ
M AN U
We assume that for all j ∈ {1, .., n}, Dj > 0. Let us consider for all j, dj = log Dj the log-transformation of the De . We propose to apply Model 3 to (dj )j : for all d ∈ R f (d|η1 , η2 , ..., ηκ , p1 , ..., pκ , τ1 , τ2 ) = p1 n(d|η1 , τ1 ) +
κ X
pk n(d|ηk , τ2 ).
(5)
k=2
EP
TE
D
This means that the equivalent doses of well bleached grains are distributed according to a log-normal distribution with parameters (η1 , τ1 ), and the equivalent doses of poorly bleached grains are distributed according to a mixture of log-normal distributions with a common scale parameter τ2 . In this case the sample palaeodose and its associated error are defined by the mean (as in Guérin et al., 2017) and the variance of the first component of the log-normal mixture, namely τ12
D = eη1 + 2 , q 2 eτ 2 − 1 eη1 +τ1 /2 . σD =
AC C
An alternative could be to define D as the median of the first component of the log-normal mixture, eη1 , as is proposed in (Galbraith et al., 1999; Galbraith and Roberts, 2012). But for dose rate reasons (see Guérin et al., 2017) the average dose absorbed by the well-bleached grains should lead to more accurate ages. Prior distributions used for the Gaussian mixture model are well suited for this model, except for the parameter η1 , where the following non-truncated prior: η1 ∼ N (200, 2002 ) is convenient. Contrary to the Gaussian mixture, η1 does not need to be positive, since it is not the palaeodose estimate.
3 Implementation and criteria for model selection For the MD2 model, the posterior distributions of the parameters are not analytically calculable. However, the Markov Chain Monte Carlo (MCMC)
10
ACCEPTED MANUSCRIPT
Implementation in R
SC
3.1
RI PT
method provides algorithms for sampling probability distributions. Therefore, we approximate the posterior distribution of the parameters for a mixture of κ Gaussian or log-normal distributions, for different fixed values of κ ∈ {2, 3, 4, 5, ...}. In this section, firstly we present numerical aspects for the implementation of our model, and we discuss the convergence assessment of the numerical methods (for instance the problem of convergences associated to MCMC methods). Secondly, we present a criterion to select the best number of components in the mixture, for both the Gaussian and the log-normal models. Thirdly, we offer a criterion to choose the best parametric family.
AC C
EP
TE
D
M AN U
In this study, we propose to implement the models in the R language and environment and to use the JAGS (Just Another Gibbs Sampler) simulation programme (see Plummer, 2003, 2015, for informations on the project and functionalities). This programme provides Markov chains to approximate the posterior distribution of Bayesian hierarchical models using a Metropoliswithin-Gibbs sampler. To implement our models using JAGS in R we use rjags and coda packages (see Plummer, 2013; Plummer et al., 2006). MCMC methods are algorithms based on the construction of a Markov chain, starting from a random value, that converges to the expected probability distribution. Once the chain is at equilibrium, the ulterior states of the chain are used as a sample of the posterior distribution. Consequently, two convergences are important. The first one is the convergence of each chain to an equilibrium distribution: to verify this, we recommend checking the Markov chain plot. Secondly, we have to ensure that the equilibrium distribution does not depend on the starting random value. We suggest running three (or even five) Markov chains, and then to use the Gelman and Rubin criterion (Gelman and Rubin, 1992, available in the "coda" package). This criterion computes the estimated variance of the parameter as a weighted sum of the whithin-chain and between-chain variance. A value close to 1 is expected. This equilibrium distribution is then the posterior distribution of the parameter.
Bayes estimator and credible interval. With the MCMC method, we obtain an estimate of the posterior distribution for each parameter of interest. Thus, to obtain a point estimate of the parameters, we propose the mean of the posterior distribution, which corresponds to the Bayes estimator using the square error risk. To take advantage of the information given by the posterior distributions, we also propose to compute the shortest credible interval or the Highest Posterior Density at 95% (or 68%) for each parameters.
3.2
Kind of Bayes Information Criterion: KoBIC
Using this numerical approach, we can obtain an estimate of the posterior distributions for parameters of mixtures of distributions with different component numbers. It can occur that the Bayes estimator of the palaeodose differs
11
ACCEPTED MANUSCRIPT
M AN U
SC
RI PT
from one mixture to another. In such cases, it is crucial to determine the best number of components required to describe the data. A Bayesian approach to estimate the optimal value of the number of components κ, would be to define a prior distribution on κ and numerically compute its posterior distribution. But this method may favor a large number of components to the detriment of consistency. In addition, the complexity of the numerical part increases drastically with this method and it suffers from numerical instability for small samples - as is often the case in luminescence samples. Another usual approach to compare Bayesian models is the use of Bayes factors (see Carlin and Louis, 1997, chapter 2), but when we deal with non informative prior distributions (as in our case) they might not be calculable. As a result, we propose to use a penalized criterion to select the best number of components. Such a criterion implements a trade-off between the goodness of fit of the model (which increases with the number of components) and the complexity of the model. A widely used criterion is the Bayes Information Criterion (BIC) (Schwarz, 1978), which can be viewed as a rough approximation to Bayes factors but is independent of the priors. This criterion is defined as follows: b M LE be the maximum likelihood Definition 3.1. Let n be the sample size, Θ κ estimator of the parameter set of the mixture of distributions. Let us define the penalized criterion as follow: for a mixture of κ distributions n
critBIC (κ) = −
κ log n 1X LE bM log f Dj |Θ + . κ n 2n
(6)
D
j=1
TE
Then the best model is the model such that the component number satisfies: κ ˆ = minκ critBIC (κ).
AC C
EP
The second term on the right hand side of Eq. (6) corresponds to the penalty function that is proportional to the number of components κ. Note that, before applying this criterion, we have to specify the parametric family: it can either be the Gaussian mixture model or the log-normal mixture model. The maximum likelihood estimator (MLE) is a frequentist estimator; in our Bayesian setting we calculate another estimator, so in Eq. (6) we propose to use Bayes estimate instead of the MLE.
3.3
Selection of the parametric family
Let us denote by fN the best Gaussian mixture model and by flN the best log-normal mixture model chosen using the KoBIC. An important question, when the posterior distribution of the parameter D is different for these two parametric families, is to determine which is the best model. Here, we are confronted with a testing problem, for which the null hypothesis is: H0: data are sampled from a Gaussian mixture, and the alternative is: H1: data are sampled from a log-normal mixture.
12
ACCEPTED MANUSCRIPT
f (D) = ρ1 fN (D) + ρ2 flN (D),
RI PT
Here again, we cannot use the Bayes factors to discriminate between the two assumptions. (Kamary et al., 2014) proposed an alternative method based on a mixture model that consists, in our setting, of a mixture model with two components fN and flN . The decision of the testing procedure is based on the posterior distribution of the weights of these two components; non-uniform distributions are expected. More precisely, we consider the following model: for all D (7)
ρ2
M AN U
ρ1
SC
where ρ2 = 1 − ρ1 . The weights are unique, unknown parameters. We refer to Figure 3 for the DAG of this Bayesian model.
Zj
Dj
D
j ∈ {1, ..., n}
TE
Figure 3: Directed Acyclic Graph (DAG) of mixture of models. The variable Zj is an unobserved data.
AC C
EP
Let us define the prior distributions for the parameters of interest (ρ1 , ρ2 ). As it is the vector of weights of a mixture, we propose as in Section 2.2 a non 1 1 informative Jeffreys prior: (ρ1 , ρ2 ) ∼ Dir 2 , 2 . As for the Gaussian mixture, we also incorporate an unobserved data: (Zj )j∈{1,...,N } , such that Zj ∼ B(ρ2 ), and f (Dj |Zj ) = (fN (Dj ))1−Zj (flN (Dj ))Zj . In this case, we can compute the full conditional distributions for each parameter ρ2 flN (Dj ) ∀j, Zj |ρ1 , ρ2 , Dj ∼ B , ρ1 fN (Dj ) + ρ2 flN (Dj ) X X 1 1 ρ1 , ρ2 |Z1 , ..., Zn ∼ Dir + 1Zj =0 , + 1Zj =1 . 2 2 j
j
Using a Gibbs sampler we can compute the posterior distribution of the weights. A posterior distribution of the weight ρ1 with an exponential increase to 1 and probability vanishing in a neighborhood of 0, or an exponential decrease from 0 and probability vanishing in a neighborhood of 1, are examples of posterior distributions of weights leading to an easy decision.
13
ACCEPTED MANUSCRIPT
4
Application to data
In this section, we illustrate the application of our model on two synthetic samples and then on two real samples: an "old" sample and a "young" sample.
Synthetic data
RI PT
4.1
4.1.1
M AN U
SC
Taking synthetic samples from a Gaussian or log-normal mixture model would not illustrate the advantage of our non parametric approach to model poorly bleached grains. Consequently, we chose to simulate samples from the mixture model of (Galbraith et al., 1999). More precisely, we propose to simulate an old sample by a mixture of a log-normal distribution and a truncated log-normal distribution; and a young sample by a mixture of a Gaussian distribution and a truncated Gaussian distribution. These models do not belong to a parametric mixture model where κ is fixed, but it should be well described by our MD2 model. In any case, our aim is to estimate the mean value of the first component of the mixture that models the well bleach grains. The KoBIC criterion is applied to select the best log-normal or Gaussian mixture to describe the truncated log-normal or Gaussian distribution.
Synthetic old dataset
A dataset, with size N = 150, was generated according to the following density function: (8)
D
f (D) = 0.6Ln(D|4, 0.3) + 0.4Ln(D|5, 0.6)1[e4 ,∞[ ,
EP
TE
where, 1S (x) : x 7→ 1 if x ∈ S and x 7→ 0 else, and Ln(.|α, β) is a log-normal density function with location parameter α and scale parameter β. The first component, with mean 57 and standard deviation 18, characterizes the well bleached grains. Figure 4 shows a histogram of the dataset. There is no negative value in the sample, thus we will apply both the Gaussian and the log-normal version of the MD2 .
Gaussian version of the MD2 . Using the JAGS software, we generate
AC C
three Markov chains for each parameter, that converge to its posterior distribution. Each Markov chain starts with a random initial value and we use a "burn-in" phase of 20000 iterations to forget the initial value. Then, 20000 iterations are necessary to adapt the tuning parameters of the algorithm. Finally, every five iterations over 20000 iterations are used to provide a sample of the posterior distribution. For our data, this number of iterations guarantees the convergency of MCMC methods. These iteration numbers are fixed for all the analyzed samples. For the dataset corresponding to Eq. (8), the MCMC algorithm converges for a mixture of 2, 3 and 4 Gaussian distributions. The density mixtures, defined by the Bayes estimators obtained with our model for a mixture of κ ∈ {2, 3, 4} Gaussian distributions are drawn in Figure 4. The point estimate and HPD (Highest Posterior Density) at 95% and 68% of the mean, standard deviation and weight of the first component of the model Eq. (8) are plotted in Figure 5, and compared with the true parameters.
14
ACCEPTED MANUSCRIPT
RI PT
The fitted distributions (see Figure 4) corresponding to mixtures of 3 or 4 components are similar. Using the KoBIC criterion, we select the model with 3 components, which corresponds to the least complex model among those yielding similar performance in terms of fitting (see Figure 5). We obtain the following estimates of the palaeodose, dispersion parameters and proportion: D = 58, σ ˆD = 14 (both values in Gy) and pˆ1 = 0.61 (see Table 1).
0.015
mixture of 2 Gaussian distributions
SC
mixture of 3 Gaussian distributions
100
D
0
200 300 equivalent doses (Gy)
TE
0.000
0.005
M AN U
density 0.010
mixture of 4 Gaussian distributions
400
500
AC C
EP
Figure 4: Histogram of the synthetic dataset of size N = 150 of the model defined in Eq. (8), and estimated density of a mixture of κ ∈ {2, 3, 4} Gaussian distributions.
15
2 2
2
1
1
3 kappa
4
2
2
2
2
2
55
RI PT
0.65
SC
Probability 0.60
M AN U
12
56
0.55
13
57
Dose (Gy) 58
Standar deviation (Gy) 14 15
59
16
60
17
61
2
0.70
ACCEPTED MANUSCRIPT
1
11
2
1
1
2
3 kappa
1
1
1
4
2
1
3 kappa
4
TE
D
Figure 5: Results obtained with the 2, 3 and 4 component mixtures in the Gaussian version of the MD2 model applied to the synthetic old dataset. HPD (Highest Posterior Density) at 95% (dark green) and 68% (light green) and Bayes estimator (blue point) for: the mean, the standard deviation and the weight (from left to right) of the first component of the model. The true values are indicated by the horizontal red lines. Log-normal version of MD2 . The MCMC algorithm converges for a
AC C
EP
mixture of 2 and 3 log-normal distributions. The density mixtures defined by the Bayes estimators obtained with the log-normal version of our model is drawn in Figure 6, and the point estimate, the HPD at 95% and 68% of the parameters of the first log-normal distribution are plotted in Figure 7. The fitted distributions corresponding to a mixture of 2 and 3 component are too similar to be distinguishable. Using the KoBIC criterion, we select the model with 2 components, which corresponds to the least complex model. We obtain the following estimates of the palaeodose and dispersion parameters (in Gy, according to what is suggested in Section 2.2.2) as well as of the proportion (see Table 1): b = 59, HP D(68%) = [56, 61], HP D(95%) = [54, 64], D σc D = 16, HP D(68%) = [13, 18], HP D(95%) = [11, 20]
(9)
pˆ1 = 0.62, HP D(68%) = [0.57, 0.70], HP D(95%) = [0.48, 0.74].
Choice of parametric family. We apply the criterion defined in Section 3.3 to select which one of these two models, both selected by the KoBIC in the Gaussian mixture family in a first step and in the log-normal mixture
16
ACCEPTED MANUSCRIPT
0.010
RI PT
mixture of 3 log−normal distributions
SC 0
M AN U
0.000
0.005
density
0.015
mixture of 2 log−normal distributions
100
200
300
400
500
equivalent doses (Gy)
D
Figure 6: Histogram of the synthetic dataset of size N = 150 of the model defined in Eq. (8), and estimated densities of a mixture of κ = 2 and 3 log-normal distributions (the two curves are too similar to be distinguishable).
AC C
EP
TE
family in a second step, is the best. Let us assume that the data are distributed according to the mixture model defined in Eq. (7). More precisely, we consider a mixture of a three-component Gaussian mixture and a two-component lognormal mixture, with weights ρ1 for the gaussian component, and ρ2 = 1 − ρ1 for the log-normal component. The posterior distributions of the weights (ρ1 , ρ2 ) are plotted in Figure 8; the mean of the posterior distribution of ρ1 is 0.43 and the mean of the posterior distribution of ρ2 is 0.57. Thus, it is not clear how to decide which of these two families better fits the data. Indeed, the posterior distribution of ρ1 has a non-zero probability to be in the neightborhood of 0 or 1. However, the palaeodose and standard deviation estimates are close when we consider a Gaussian mixture and a log-normal mixture. In this case, the choice of the parametric family is not crucial. It should be noted here that we have decided, to test the model, to work with a sample size typical of experimental datasets (N =150). As a result, the parameter estimates vary from one sample to another; in particular, the scale parameter is difficult to track in complex models such as mixture models applied to small-sized samples. In addition, it seems that for limited sample sizes, the scale parameter of the first component might be underestimated without any effect on the palaeodose (and thus age) estimation. This underestimation could be due to the chosen distribution from which the sample is drawn (Eq. (8)) - obviously, the real test will need to be performed on real,
17
2
2
0.65
RI PT
0.28
M AN U
0.50
0.22
0.55
SC
0.24
Probability 0.60
τ (without unit) 0.26
η (without unit) 4.05 4.00
1
0.20
1
3 kappa
2 2
0.70
0.30
4.10
2
1
2
0.75
2
0.32
ACCEPTED MANUSCRIPT
1
2
3
kappa
1
1
2
3 kappa
TE
D
Figure 7: Results obtained with the 2 and 3 component mixture in the log-normal version of the MD2 model applied to the synthetic old dataset. HPD at 95% (dark green) and 68% (light green) and Bayes estimator (blue point) for: the location parameter, the scale parameter and the weight (from left to right) of the first component of the model. The true values are indicated by the horizontal red lines.
AC C
EP
known-age poorly-bleached samples.
18
ACCEPTED MANUSCRIPT
2.0 rho_1 rho_2
RI PT
density
1.5
1.0
0.0 0.25
0.50 weight
0.75
1.00
M AN U
0.00
SC
0.5
Figure 8: Posterior distributions of the weights of the Gaussian component (pink, of weight ρ1 ) and of the log-normal component (blue, of weight ρ2 ), defined in Eq. (7) and Figure 3. 4.1.2
Synthetic young dataset
D
As in the previous sub-section, a dataset of size N = 150 was generated following the distribution defined by:
TE
f (D) = 0.7n(D|1, 0.4) + 0.3n(D|6, 4)1[1,∞[ .
(10)
AC C
EP
Consequently, the first component has a mean equal to 1, a standard deviation equal to 0.4 and its proportion in the whole population is 70% (see Figure 9 for a histogram of the dataset). For this dataset, there are negative values, so we can only apply the gaussian version of the MD2 . MCMC converge for a mixture of κ ∈ {2, 3, 4} Gaussian distributions, and estimated densities are shown in Figure 9. The point estimates and HPD at 95% and 68% of the mean, standard deviation and weight of the first component of the dataset are plotted in Figure 10, and compared with the true parameters. The fitted distributions of these mixtures are similar, here again the KoBIC selects the least complex model: the mixture of two Gaussian distributions.
19
0.8
ACCEPTED MANUSCRIPT
mixture of 2 Gaussian distributions
RI PT
0.6
mixture of 3 Gaussian distributions
0.0
M AN U
0.2
SC
density 0.4
mixture of 4 Gaussian distributions
0
5 10 equivalent doses (Gy)
4.2.1
Real data
TE
4.2
D
Figure 9: Histogram of a dataset of size N = 150 corresponding to the model defined in Eq. (10) and estimated densities of mixtures of κ ∈ {2, 3, 4} Gaussian distributions (the three curves are too similar to be distinguishable).
Data: FER2
AC C
EP
Here we propose to study sample FER2, from the La Ferrassie rockshelter in the Dordogne region of France. For more details about this sample, the reader is referred to (Guérin et al., 2015b). For this sample, the MCMC algorithm converges on a mixture of 2 and 3 Gaussian distributions. The corresponding estimated densities are drawn in Figure 11, and point estimates and HPD at 95% and 68% of the mean, standard deviation and weight of the component of equivalent doses of well bleached grains are plotted in Figure 12. According to the KoBIC, a mixture of three Gaussian distributions appears to be the best model. For this model, the estimated palaeodose, disb = 80, σ persion (both values in Gy) and proportion values are D bD = 30 and pˆ1 = 0.74 (see Table 1). For this sample, the MCMC algorithm does not converge for mixtures of log-normal distributions. A potential explanation for this is the fact that these data are well described by one log-normal distribution with mean 115. However, the comparison between quartz OSL and K-feldspar IRSL signals, as well as the stratigraphic inversion when calculating the age of this sample using e.g. the CAM (from Galbraith et al., 1999), both show that this sample was poorly bleached at the time of deposition. Therefore, a log-normal distri-
20
1.15
ACCEPTED MANUSCRIPT
2
2
2
2
2
2
2
2
1
2
SC 1
1
1
1
3 kappa
RI PT
0.70 Probability
0.65
M AN U
0.60
1.00
0.35
Dose (Gy) 1.05
Standar deviation (Gy) 0.40
1.10
0.45
0.75
2
4
2
1
3 kappa
4
1 1
1
2
3 kappa
4
TE
D
Figure 10: Results obtained with the 2, 3 and 4 component mixtures in the Gaussian version of the MD2 model applied to the synthetic young dataset. HPD at 95% (dark green) and 68% (light green) and Bayes estimator (blue point) for: the mean, the standard deviation and the weight (from left to right) of the first component. The true values are indicated by the red line.
AC C
EP
bution to describe this data cannot be considered as a valid model. Conversely, the Gaussian mixture model selects equivalent doses of well bleached grains (the first component of the mixture) to estimate the palaeodose. Based on radiocarbon ages obtained for the same archaeological layer, a palaeodose of ∼79 Gy is expected. Thus, the result obtained with the Gaussian mixture is completely satisfactory.
21
0.012
ACCEPTED MANUSCRIPT
mixture of 2 Gaussian distributions
SC 0
100
M AN U
0.000
0.004
density
RI PT
0.008
mixture of 3 Gaussian distributions
200 300 equivalent doses (Gy)
400
Sample from the medieval chapel "Saint Jean Baptiste"
TE
4.2.2
D
Figure 11: Histogram of FER2 sample of size N = 91. The estimated density of the dataset using MD2 for a mixture of κ ∈ {2, 3} Gaussian distributions are shown in green and black.
AC C
EP
The sample BDX17688 was extracted from a yellowish layer in archaeological connection with the foundations of the chapel "Saint Jean Baptiste de la Cité" in Périgueux (Dordogne, France). This monument, known as being a classified reference for the Renaissance style was in fact built in late roman times on the remains of an older roman villa. Several phases of construction occurred from the 3rd century AD until the Renaissance period (16th c. AD). Lime mortars and Cocciopesto (fragments of crushed brick mixed with lime and sand) samples were taken from different built structures, as well as sediments and elements of soils in order to characterize the chronological sequence of the whole building (Guibert et al., b). Amongst samples under investigation, BDX17688 is a particular one, since its identification on the site was not entirely clear. The archaeological team hesitated to identify this yellowish material either as mortar (assumption named A1) or as sedimentary soil artificially mixed with limestone fragments (including unbleached or very poorly bleached quartz grains, assumption named A2). From the point of view of the stratigraphy, it is in connection with the foundations of the medieval part of the chapel, archaeologically dated with certainty within the interval 9th - 13th c. AD (Gaillard et al., 2008). From this sample, 200-250 µm quartz grains were extracted for SG OSL measurements.
22
2
2
90
2
2
2
0.8 Probability 0.7
RI PT
35 Standar deviation (Gy) 30
1
SC
85
2
Dose (Gy) 80
0.6
1
25
75
0.9
40
ACCEPTED MANUSCRIPT
M AN U
70
1
1
2
1
3
2
kappa
3
kappa
1
2
3 kappa
TE
D
Figure 12: Results obtained with the 2 and 3 component mixtures in the Gaussian version of the MD2 model applied to FER2 data set. HPD at 95% (dark green) and 68% (light green) and Bayes estimator (blue point) for: the mean, the standard deviation and the weight (from left to right) of the first component. Gaussian version of MD2 . The MCMC algorithm converges for a mix-
AC C
EP
ture of 2, 3, 4 and 5 Gaussian distributions. The corresponding estimated densities are drawn in Figure 13, and point estimates and HPD at 95% and 68% of the mean, standard deviation and weight of the component of equivalent doses of well bleached grains are plotted in Figure 14. The Bayes estimators of the palaeodose D are similar for all mixture models (∼ 2.9), but the dispersion σD differs slightly (0.76, 0.79, 0.82, 0.81). According to the KoBIC, a mixture of three Gaussian distributions appears to be the best model. The estimates of the palaeodose, dispersion (both values in Gy) b = 2.9, σ and proportion values for this model are D bD = 0.79 and pˆ1 = 0.36 (see Table 1).
Log-normal version of MD2 . The MCMC algorithm convergences for a mixture of 2 and 3 log-normal distributions. The estimated densities are drawn in Figure 15, and point estimates and HPD at 95% and 68% of the parameters of the first log-normal distribution (to describe the equivalent doses of well bleached grains) are plotted in Figure 16. It should be noted that, in Figure 16, the estimates of the parameters of the first component do not correspond to the palaeodose estimate. We obtain the following estimates of the palaeodose and dispersion parameters (in Gy, according to what is suggested in Section 2.2.2) as well as of the
23
0.20
ACCEPTED MANUSCRIPT
0.15
mixture of 2 Gaussian distributions
RI PT
mixture of 3 Gaussian distributions
density 0.10
mixture of 4 Gaussian distributions
0.00
M AN U
SC
0.05
mixture of 5 Gaussian distributions
0
20
40 60 80 equivalent doses (Gy)
100
120
mixture of 2 Gaussian
mixture of 3 Gaussian
D
Figure 13: Histogram of De values for sample BDX17688 (size N = 104). The estimated density of the dataset using MD2 for a mixture of κ ∈ {2, 3, 4, 5} Gaussian distributions are shown in yellow, light green, dark green and black.
b = 3.3, D σc D = 1.8,
TE
proportion (see Table 1):
HP D(68%) = [1.3, 2],
HP D(95%) = [2.7, 3.9], HP D(95%) = [1.1, 2.6],
HP D(68%) = [0.38, 0.48],
EP
pˆ1 = 0.43,
HP D(68%) = [2.9, 3.5],
(11)
HP D(95%) = [0.32, 0.52].
Choice of parametric family. The palaeodose estimates are different if
AC C
b = 2.9) or a log-normal mixture (D b = 3.3). we consider a Gaussian mixture (D This is also the case for the standard deviation and the proportion of De estimates of well bleached grains. It is then important to determine which of this two parametric families better describes the data. Let us assume that the data are distributed according to the mixture model defined in Eq. (7), i.e. that the data can be described by a three-component gaussian mixture with probability ρ1 or by a two-component log-normal mixture with probability ρ2 = 1−ρ1 . Posterior distributions of the weights (ρ1 , ρ2 ) are plotted in Figure 17. The posterior distribution of the weights of the Gaussian mixture, ρ1 , is equal to 0 for small values and has increasing probability in the neighborhood of 1 and the mean of the posterior distribution is equal to 0.76. It is the reverse for the posterior distribution of the weight of the log-normal mixture (ρ2 ) and the mean of the posterior distribution is equal to 0.24. Therefore, this model argues in favor of the gaussian mixture model, with palaeodose b = 2.9 Gy. estimate D
24
2
1.1
2 2
2
2
2
2
0.45
3.2
ACCEPTED MANUSCRIPT
2
2
2
2
1
RI PT
0.40
SC
Probability 0.35 0.30
M AN U
0.6 1
1
1 1
2
3 4 kappa
0.5
2.6
2.7
2.8
Dose (Gy) 2.9
Standar deviation (Gy) 0.7 0.8 0.9
3.0
1.0
3.1
2
5
1
1
1
2
3 4 kappa
5
1
1 1
1
2
3 4 kappa
5
TE
D
Figure 14: Results obtained with the 2, 3, 4 and 5 component mixtures in the Gaussian version of the MD2 model applied to BDX17688 dataset. HPD at 95% (dark green) and 68% (light green) and Bayes estimator (blue point) for: the mean, the standard deviation and the weight (from left to right) of the first component of the model.
AC C
EP
This estimate of the palaeodose leads to an age of about 2500 years before present, whereas from the archaeological analysis and SG OSL dates obtained from mortars taken from the corresponding masonries (Guibert et al., b), an age of 1000-1200 years is expected. This apparent age is of course overestimated for a medieval mortar, but these results are consistent with the assumption A2 about the nature of the yellowish layer the sample was taken from.
25
0.15
ACCEPTED MANUSCRIPT
mixture of 2 log−normal distributions
0
M AN U
0.00
SC
0.05
density
0.10
RI PT
mixture of 3 log−normal distributions
20
40 60 80 equivalent doses (Gy)
100
120
Figure 15: Histogram of De values for sample BDX17688. The estimated density of the dataset using the MD2 for a mixture of κ = 2 Gaussian distributions is shown.
Discussion and conclusion
D
5
EP
TE
We have presented here a new model for the statistical analysis of OSL data from poorly-bleached samples. This Bayesian model, the MD2 , has been tested on synthetic and real De distributions; in both cases, we obtain consistent results in agreement with the expected values, when available.
No required input parameter. One of the main differences between
AC C
currently existing models for poorly bleached samples and our model lies in the need to define an input parameter in the former models, such as the IEU and the MDM. This input parameter is an expected dispersion value characteristic
Expected
Sample Synthetic old
Synthetic young FER2 BDX17688
Dose estimates from Gaussian family (Gy)
Dose estimates from log-normal family (Gy)
palaeodose (Gy)
Bayes estimate
HPD at 68%
HPD at 95%
Bayes estimate
HPD at 68%
HPD at 95%
57
58
[56,59]
[55,61]
59
[56,61]
[54,64]
1
1.06
[1.02,1.1]
[0.97,1.14]
∼79
80
[76,86]
[68,90]
A2 <10
2.9
[2.7,3]
[2.6,3.2]
N/A (negative values) MCMC do not converge 3.3
[2.9,3.5]
[2.7,3.9]
A1 <1.2
Table 1: Summary of palaeodose estimates obtained with MD2 model (Gaussian and log-normal version when it was possible). Bold values are those that are retained by our criteria. 26
0.65
ACCEPTED MANUSCRIPT
2
1.20
2
RI PT
0.45 Probability
0.40
M AN U
0.35
0.40
SC
0.45
1.00
η (without unit) 1.05 1.10
τ (without unit) 0.50 0.55
1.15
0.60
2
0.95 0.90
2
0.50
2
2
1
1 1
2
1
3
2
kappa
3
kappa
1
1
2
3 kappa
TE
D
Figure 16: Results obtained with the 2 and 3 component mixtures in the log-normal version of the MD2 model applied to BDX17688 dataset. HPD at 95% (dark green) and 68% (light green) and Bayes estimator (blue point) for: the location parameter, the scale parameter and the weight (from left to right) of the first component of the model.
AC C
EP
of well bleached distributions. In the MDM case, a log-normal distribution with a dispersion equal to this input value is fitted to the lower De values, while in the IEU the average dose is calculated from the n sorted De values for which the dispersion is equal to the input parameter. A clear drawback of this approach is that it is most often very difficult to obtain an accurate estimate for this input parameter, since (i) dose recovery populations will lead to underestimated values and (ii) when contemporaneous, well-bleached analogues are available, there is no guarantee that the dispersion in single grain dose rates for the well bleached sample will be relevant for the poorly bleached one (in particular if grain size distributions and/or mineralogical compositions vary from one sample to the other, cf. Guérin et al., 2015c). In addition, such an input parameter may lead to arbitrarily cutting the mode of the distribution in low De values; this may lead to underestimated ages. In contrast, the MD2 will reflect more faithfully the observed De values. Furthermore, we see the absence of a required, potentially subjective input parameter as a great strength of the MD2 model.
Convergence of MCMC algorithm. It can happen that the MCMC algorithm does not converge. For example, for sample FER2, the algorithm does not converges for any mixture of log-normal distributions; it does not con-
27
ACCEPTED MANUSCRIPT
2.5 rho_1 rho_2
RI PT
2.0
density
1.5
SC
1.0
0.0 0.00
0.25
M AN U
0.5
0.50 weight
0.75
1.00
D
Figure 17: The posterior distribution of the weight of the Gaussian mixture component is shown in pink, while the posterior distribution of the weight of the log-normal mixture component is shown in blue.
AC C
EP
TE
verge either for a mixture of more than three Gaussian distributions. It could be that, if we had convergence for a mixture of four Gaussian distributions, the mean of the first component would be smaller than 80. However, we observe that the mean of the first component of a mixture remains stable when the number of components increases, as we can see for sample BDX17688. Other MCMC algorithms (not implemented in JAGS), such as the Hamiltonian Algorithm (for more information, we refer to Brooks et al., 2011), could be implemented to solve these numerical problems.
Expansion of MD2 . We argue that our Mixture Distribution Models for Dose (MD2 ) can be easily adpated for other types of sample. For example, for samples composed of grains that have been fully bleached at deposition and mixed subsequently with younger intrusive grains; or samples where modern sunlight was able to penetrate the outer few millimeters, resulting in complete or partial bleaching of some grains contemporaneously with sampling (Olley et al., 2006). For such samples it is common to use the Maximum Age Model (Galbraith and Roberts, 2012), or the Finite Mixture Model (Roberts et al., 2000), to resolve post-depositional mixing. Nevertheless, further tests of the MD2 will be needed to test this assumption.
Age computation in Bayesian setting. The MD2 provides a palaeodose estimate for poorly bleached samples, but the sample age also depends on
28
ACCEPTED MANUSCRIPT
RI PT
its dose rate. As in (Combès et al., 2015) Section 4.4, to compute the sample age, we consider that the dose rate is known without error. But to take full avantage of the Bayesian setting, we refer to (Combès and Philippe, 2017) who proposed a model to compute age in stratigraphic constraints, taking into account systematic and random errors. To conclude, our model comes in addition to other Bayesian models including (Combès et al., 2015; Guérin et al., 2015a; Combès and Philippe, 2017) and should enable the computation of ages for all sample types (both well and poorly bleached) in stratigraphic constraints.
Acknowledgments
M AN U
SC
The first author is supported by the LabEx Sciences archéologiques de Bordeaux, LaScArBx-ANR-10-LABX-52. We wish to thank Philippe Lanos, Chantal Tribolo and Christelle Lahaye for discussion and suggestions. We are grateful to Jean Baptiste Javel and Petra Urbanova for mortar study of Saint Jean Baptiste of Périgueux and data collection, and the Région Aquitaine that cofinancing the MoDAq programme (Mortar Dating in Aquitaine, programme devoted to the chronological study of ancient buildings).
References
D
Brooks, S., Gelman, A., Jones, G. L., and Meng, X.-L. (2011). Handbook of markov chain monte carlo. Chapman and Hall/CRC.
TE
Buck, C. E., Cavanagh, W. G., and Litton, C. D. (1996). Bayesian approach to interpreting archaeological data. Wiley.
EP
Buck, C. E., Kenworthy, J. B., Litton, C. D., and Smith, A. F. M. (1991). Combining archaeological and radiocarbon information: a Bayesian approach to calibration. Antiquity, 65(249):808–821.
AC C
Carlin, B. P. and Louis, T. A. (1997). Bayes and empirical Bayes methods for data analysis. Springer. Combès, B. and Philippe, A. (2017). Bayesian analysis of individual and systematic multiplicative errors for estimating ages with stratigraphic constraints in optically stimulated luminescence dating. Quaternary Geochronology, 39:24–34. Combès, B., Philippe, A., Lanos, P., Mercier, N., Tribolo, C., Guérin, G., Guibert, P., and Lahaye, C. (2015). A Bayesian central equivalent dose model for optically stimulated luminescence dating. Quaternary Geochronology, 28:62–70. Cunningham, A., Wallinga, J., Hobo, N., Versendaal, A., Makaske, B., and Middelkoop, H. (2014). Does deposition depth control the OSL bleaching of fluvial sediment? Earth Surface Dynamics Discussions, 2(2):575–603. Duller, G. (2015). Analyst v4. 31.7 user manual. Technical report, Tech. Rep.
29
ACCEPTED MANUSCRIPT
Duller, G., Bøtter-Jensen, L., Murray, A., and Truscott, A. (1999). Single grain laser luminescence (SGLL) measurements using a novel automated reader. Nuclear instruments and methods in physics research Section B: beam interactions with materials and atoms, 155(4):506–514.
RI PT
Ferguson, T. S. (1973). A Bayesian analysis of some nonparametric problems. The annals of statistics, pages 209–230.
Gaillard, H., Boyer-Gardner, D., and Réveillas, H. (2008). La Chapelle SaintJean-Baptiste à Périgueux (Dordogne). Archaeological report 2007-2008 to the Regional Archaeological Service of the French Ministry of Culture and Communication, page 254.
SC
Galbraith, R. F. and Roberts, R. G. (2012). Statistical aspects of equivalent dose and error calculation and display in OSL dating: an overview and some recommendations. Quaternary Geochronology, 11:1–27.
M AN U
Galbraith, R. F., Roberts, R. G., Laslett, G. M., Yoshida, H., and Olley, J. M. (1999). Optical dating of single and multiple grains of quartz from Jinmium rock shelter, northern Australia: part i, experimental design and statistical models*. Archaeometry, 41(2):339–364. Gelman, A. and Rubin, D. B. (1992). Inference from iterative simulation using multiple sequences. Statistical science, pages 457–472.
D
Ghosh, J. and Ramamoorthi, R. (2003). Bayesian Nonparametrics. Springer Series in Statistics.
TE
Gill, J. (2014). Bayesian methods: A social and behavioral sciences approach, volume 20. CRC press.
AC C
EP
Guérin, G., Christophe, C., Philippe, A., Murray, A., Thomsen, K., Tribolo, C., Urbanova, P., Jain, M., Guibert, P., Mercier, N., et al. (2017). Absorbed dose, equivalent dose, measured dose rates, and implications for OSL age estimates: Introducing the Average Dose Model. In press. Quaternary Geochronology. Guérin, G., Combès, B., Lahaye, C., Thomsen, K. J., Tribolo, C., Urbanova, P., Guibert, P., Mercier, N., and Valladas, H. (2015a). Testing the accuracy of a Bayesian central-dose model for single-grain OSL, using known-age samples. Radiation Measurements, 81:62–70. Guérin, G., Frouin, M., Talamo, S., Aldeias, V., Bruxelles, L., Chiotti, L., Dibble, H. L., Goldberg, P., Hublin, J.-J., Jain, M., Lahaye, C., Madelaine, S., Maureille, B., McPherron, S. J., Mercier, N., Murray, A. S., Sandgathe, D., Steele, T. E., Thomsen, K. J., and Turq, A. (2015b). A multi-method luminescence dating of the Palaeolithic sequence of La Ferrassie based on new excavations adjacent to the La Ferrassie 1 and 2 skeletons. Journal of Archaeological Science, 58:147–166. Guérin, G., Jain, M., Thomsen, K. J., Murray, A. S., and Mercier, N. (2015c). Modelling dose rate to single grains of quartz in well-sorted sand samples:
30
ACCEPTED MANUSCRIPT
The dispersion arising from the presence of potassium feldspars and implications for single grain OSL dating. Quaternary Geochronology, 27:52–65.
RI PT
Guibert, P., Christophe, C., Urbanova, P., Guérin, G., and Blain, S. Modeling incomplete and heterogeneous bleaching of mobile grains partially exposed to the light: towards a new tool for single grain OSL dating of poorly bleached mortars. Submitted to Radiation Measurement.
Guibert, P., Javel, J.-B., Urbanova, P., and Gauillard, H. Saint Jean Baptiste de la Cité, Périgueux, Dordogne: Analyse chronologique. In press to Bilan Scientifique Régional Nouvelle Aquitaine du Service Régional de l’Archéologie.
SC
Huntley, D. J., Godfrey-Smith, D. I., and Thewalt, M. L. W. (1985). Optical dating of sediments. Nature, 313:107.
M AN U
Huntriss, A. (2008). A Bayesian analysis of luminescence dating. PhD thesis, Durham University. Ishwaran, H. and James, L. F. (2002). Approximate Dirichlet Process Computing in Finite Normal Mixtures. Journal of Computational and Graphical Statistics, 11(3):508–532. Kamary, K., Mengersen, K., Robert, C. P., and Rousseau, J. (2014). Testing hypotheses via a mixture estimation model. arXiv preprint arXiv:1412.2044.
TE
D
Marin, J.-M., Mengersen, K., and Robert, C. P. (2005). Bayesian modelling and inference on mixtures of distributions. Handbook of statistics, 25(16):459–507.
EP
McLachlan, G. and Peel, D. (2000). Finite mixture models. Wiley Series in Probability and Statistics: Applied Probability and Statistics. WileyInterscience, New York. Millard, A. R. (2006a). Bayesian analysis of ESR dates, with application to Border Cave. Quaternary Geochronology, 1(2):159–166.
AC C
Millard, A. R. (2006b). Bayesian analysis of pleistocene chronometric methods. Archaeometry, 48(2):359–375. Murray, A., Thomsen, K. J., Masuda, N., Buylaert, J.-P., and Jain, M. (2012). Identifying well-bleached quartz using the different bleaching rates of quartz and feldspar luminescence signals. Radiation Measurements, 47(9):688–695.
Murray, A. S. and Wintle, A. G. (2000). Luminescence dating of quartz using an improved single-aliquot regenerative-dose protocol. Radiation Measurements, 32(1):57 – 73. Olley, J. M., Roberts, R. G., Yoshida, H., and Bowler, J. M. (2006). Singlegrain optical dating of grave-infill associated with human burials at Lake Mungo, Australia. Quaternary Science Reviews, 25(19):2469–2474.
31
ACCEPTED MANUSCRIPT
Plummer, M. (2003). JAGS: A program for analysis of Bayesian graphical models using Gibbs sampling. In Proceedings of the 3rd international workshop on distributed statistical computing, volume 124, page 125. Technische Universit at Wien Wien, Austria.
RI PT
Plummer, M. (2013). rjags: Bayesian Graphical Models using MCMC. URL http://CRAN. R-project. org/package= rjags. R package version, 2:0–4. Plummer, M. (2015). JAGS Version 4.0. 0 user manual.
Plummer, M., Best, N., Cowles, K., and Vines, K. (2006). CODA: Convergence diagnosis and output analysis for MCMC. R news, 6(1):7–11.
M AN U
SC
Rhodes, E., Ramsey, C. B., Outram, Z., Batt, C., Willis, L., Dockrill, S., and Bond, J. (2003). Bayesian methods applied to the interpretation of multiple OSL dates: high precision sediment ages from Old Scatness Broch excavations, Shetland Isles. Quaternary Science Reviews, 22(10):1231–1244. Robert, C. P. (2001). The Bayesian choice: From decision-theoretic foundations to computational implementation. Springer-Verlag, New York. Roberts, R., Bird, M., Olley, J., Galbraith, R., Lawson, E., Laslett, G., Yoshida, H., Jones, R., Fullagar, R., Jacobsen, G., and Hua, Q. (1998). Optical and radiocarbon dating at Jinmium rock shelter in northern Australia. Nature, 393(6683):358–362.
TE
D
Roberts, R. G., Galbraith, R. F., Yoshida, H., Laslett, G. M., and Olley, J. M. (2000). Distinguishing dose populations in sediment mixtures: a test of single-grain optical dating procedures using mixtures of laboratory-dosed quartz. Radiation Measurements, 32(5):459–465.
EP
Schwarz, G. (1978). Estimating the dimension of a model. The annals of statistics, 6(2):461–464.
AC C
Sivia, D. S., Burbidge, C., Roberts, R. G., and Bailey, R. M. (2004). A Bayesian approach to the evaluation of equivalent doses in sediment mixtures for luminescence dating. Bayesian Inference and Maximum Entropy Methods in Science and Engineering, 735:305–311. Thomsen, K. J., Murray, A., Bøtter-Jensen, L., and Kinahan, J. (2007). Determination of burial dose in incompletely bleached fluvial samples using single grains of quartz. Radiation Measurements, 42(3):370–379.
Zhang, H. and Yangxin, H. (2015). Finite mixture models and their applications: a review. Austin Biometrics and Biostatistics, 2(1):1013. Zink, A. (2013). A coarse Bayesian approach to evaluate luminescence ages. Geochronometria, 40(2):90–100. Zink, A. (2015). Bayesian analysis of luminescence measurements. Radiation Measurements, 81:71–77.
32
ACCEPTED MANUSCRIPT
References Brooks, S., Gelman, A., Jones, G. L., and Meng, X.-L. (2011). Handbook of markov chain monte carlo. Chapman and Hall/CRC.
RI PT
Buck, C. E., Cavanagh, W. G., and Litton, C. D. (1996). Bayesian approach to interpreting archaeological data. Wiley. Buck, C. E., Kenworthy, J. B., Litton, C. D., and Smith, A. F. M. (1991). Combining archaeological and radiocarbon information: a Bayesian approach to calibration. Antiquity, 65(249):808–821.
SC
Carlin, B. P. and Louis, T. A. (1997). Bayes and empirical Bayes methods for data analysis. Springer.
M AN U
Combès, B. and Philippe, A. (2017). Bayesian analysis of individual and systematic multiplicative errors for estimating ages with stratigraphic Quaternary constraints in optically stimulated luminescence dating. Geochronology, 39:24–34. Combès, B., Philippe, A., Lanos, P., Mercier, N., Tribolo, C., Guérin, G., Guibert, P., and Lahaye, C. (2015). A Bayesian central equivalent dose model for optically stimulated luminescence dating. Quaternary Geochronology, 28:62–70.
D
Cunningham, A., Wallinga, J., Hobo, N., Versendaal, A., Makaske, B., and Middelkoop, H. (2014). Does deposition depth control the OSL bleaching of fluvial sediment? Earth Surface Dynamics Discussions, 2(2):575–603.
TE
Duller, G. (2015). Analyst v4. 31.7 user manual. Technical report, Tech. Rep.
EP
Duller, G., Bøtter-Jensen, L., Murray, A., and Truscott, A. (1999). Single grain laser luminescence (SGLL) measurements using a novel automated reader. Nuclear instruments and methods in physics research Section B: beam interactions with materials and atoms, 155(4):506–514.
AC C
Ferguson, T. S. (1973). A Bayesian analysis of some nonparametric problems. The annals of statistics, pages 209–230.
Gaillard, H., Boyer-Gardner, D., and Réveillas, H. (2008). La Chapelle SaintJean-Baptiste à Périgueux (Dordogne). Archaeological report 2007-2008 to the Regional Archaeological Service of the French Ministry of Culture and Communication, page 254. Galbraith, R. F. and Roberts, R. G. (2012). Statistical aspects of equivalent dose and error calculation and display in OSL dating: an overview and some recommendations. Quaternary Geochronology, 11:1–27.
Galbraith, R. F., Roberts, R. G., Laslett, G. M., Yoshida, H., and Olley, J. M. (1999). Optical dating of single and multiple grains of quartz from Jinmium rock shelter, northern Australia: part i, experimental design and statistical models*. Archaeometry, 41(2):339–364.
33
ACCEPTED MANUSCRIPT
Gelman, A. and Rubin, D. B. (1992). Inference from iterative simulation using multiple sequences. Statistical science, pages 457–472. Ghosh, J. and Ramamoorthi, R. (2003). Bayesian Nonparametrics. Springer Series in Statistics.
RI PT
Gill, J. (2014). Bayesian methods: A social and behavioral sciences approach, volume 20. CRC press.
SC
Guérin, G., Christophe, C., Philippe, A., Murray, A., Thomsen, K., Tribolo, C., Urbanova, P., Jain, M., Guibert, P., Mercier, N., et al. (2017). Absorbed dose, equivalent dose, measured dose rates, and implications for OSL age estimates: Introducing the Average Dose Model. In press. Quaternary Geochronology.
M AN U
Guérin, G., Combès, B., Lahaye, C., Thomsen, K. J., Tribolo, C., Urbanova, P., Guibert, P., Mercier, N., and Valladas, H. (2015a). Testing the accuracy of a Bayesian central-dose model for single-grain OSL, using known-age samples. Radiation Measurements, 81:62–70.
D
Guérin, G., Frouin, M., Talamo, S., Aldeias, V., Bruxelles, L., Chiotti, L., Dibble, H. L., Goldberg, P., Hublin, J.-J., Jain, M., Lahaye, C., Madelaine, S., Maureille, B., McPherron, S. J., Mercier, N., Murray, A. S., Sandgathe, D., Steele, T. E., Thomsen, K. J., and Turq, A. (2015b). A multi-method luminescence dating of the Palaeolithic sequence of La Ferrassie based on new excavations adjacent to the La Ferrassie 1 and 2 skeletons. Journal of Archaeological Science, 58:147–166.
EP
TE
Guérin, G., Jain, M., Thomsen, K. J., Murray, A. S., and Mercier, N. (2015c). Modelling dose rate to single grains of quartz in well-sorted sand samples: The dispersion arising from the presence of potassium feldspars and implications for single grain OSL dating. Quaternary Geochronology, 27:52–65.
AC C
Guibert, P., Christophe, C., Urbanova, P., Guérin, G., and Blain, S. Modeling incomplete and heterogeneous bleaching of mobile grains partially exposed to the light: towards a new tool for single grain OSL dating of poorly bleached mortars. Submitted to Radiation Measurement. Guibert, P., Javel, J.-B., Urbanova, P., and Gauillard, H. Saint Jean Baptiste de la Cité, Périgueux, Dordogne: Analyse chronologique. In press to Bilan Scientifique Régional Nouvelle Aquitaine du Service Régional de l’Archéologie. Huntley, D. J., Godfrey-Smith, D. I., and Thewalt, M. L. W. (1985). Optical dating of sediments. Nature, 313:107. Huntriss, A. (2008). A Bayesian analysis of luminescence dating. PhD thesis, Durham University. Ishwaran, H. and James, L. F. (2002). Approximate Dirichlet Process Computing in Finite Normal Mixtures. Journal of Computational and Graphical Statistics, 11(3):508–532.
34
ACCEPTED MANUSCRIPT
Kamary, K., Mengersen, K., Robert, C. P., and Rousseau, J. (2014). Testing hypotheses via a mixture estimation model. arXiv preprint arXiv:1412.2044. Marin, J.-M., Mengersen, K., and Robert, C. P. (2005). Bayesian modelling and inference on mixtures of distributions. Handbook of statistics, 25(16):459–507.
RI PT
McLachlan, G. and Peel, D. (2000). Finite mixture models. Wiley Series in Probability and Statistics: Applied Probability and Statistics. WileyInterscience, New York.
Millard, A. R. (2006a). Bayesian analysis of ESR dates, with application to Border Cave. Quaternary Geochronology, 1(2):159–166.
SC
Millard, A. R. (2006b). Bayesian analysis of pleistocene chronometric methods. Archaeometry, 48(2):359–375.
M AN U
Murray, A., Thomsen, K. J., Masuda, N., Buylaert, J.-P., and Jain, M. (2012). Identifying well-bleached quartz using the different bleaching rates of quartz and feldspar luminescence signals. Radiation Measurements, 47(9):688–695. Murray, A. S. and Wintle, A. G. (2000). Luminescence dating of quartz using an improved single-aliquot regenerative-dose protocol. Radiation Measurements, 32(1):57 – 73.
D
Olley, J. M., Roberts, R. G., Yoshida, H., and Bowler, J. M. (2006). Singlegrain optical dating of grave-infill associated with human burials at Lake Mungo, Australia. Quaternary Science Reviews, 25(19):2469–2474.
EP
TE
Plummer, M. (2003). JAGS: A program for analysis of Bayesian graphical models using Gibbs sampling. In Proceedings of the 3rd international workshop on distributed statistical computing, volume 124, page 125. Technische Universit at Wien Wien, Austria. Plummer, M. (2013). rjags: Bayesian Graphical Models using MCMC. URL http://CRAN. R-project. org/package= rjags. R package version, 2:0–4.
AC C
Plummer, M. (2015). JAGS Version 4.0. 0 user manual. Plummer, M., Best, N., Cowles, K., and Vines, K. (2006). CODA: Convergence diagnosis and output analysis for MCMC. R news, 6(1):7–11.
Rhodes, E., Ramsey, C. B., Outram, Z., Batt, C., Willis, L., Dockrill, S., and Bond, J. (2003). Bayesian methods applied to the interpretation of multiple OSL dates: high precision sediment ages from Old Scatness Broch excavations, Shetland Isles. Quaternary Science Reviews, 22(10):1231–1244. Robert, C. P. (2001). The Bayesian choice: From decision-theoretic foundations to computational implementation. Springer-Verlag, New York.
Roberts, R., Bird, M., Olley, J., Galbraith, R., Lawson, E., Laslett, G., Yoshida, H., Jones, R., Fullagar, R., Jacobsen, G., and Hua, Q. (1998). Optical and radiocarbon dating at Jinmium rock shelter in northern Australia. Nature, 393(6683):358–362.
35
ACCEPTED MANUSCRIPT
Roberts, R. G., Galbraith, R. F., Yoshida, H., Laslett, G. M., and Olley, J. M. (2000). Distinguishing dose populations in sediment mixtures: a test of single-grain optical dating procedures using mixtures of laboratory-dosed quartz. Radiation Measurements, 32(5):459–465.
RI PT
Schwarz, G. (1978). Estimating the dimension of a model. The annals of statistics, 6(2):461–464.
Sivia, D. S., Burbidge, C., Roberts, R. G., and Bailey, R. M. (2004). A Bayesian approach to the evaluation of equivalent doses in sediment mixtures for luminescence dating. Bayesian Inference and Maximum Entropy Methods in Science and Engineering, 735:305–311.
SC
Thomsen, K. J., Murray, A., Bøtter-Jensen, L., and Kinahan, J. (2007). Determination of burial dose in incompletely bleached fluvial samples using single grains of quartz. Radiation Measurements, 42(3):370–379.
M AN U
Zhang, H. and Yangxin, H. (2015). Finite mixture models and their applications: a review. Austin Biometrics and Biostatistics, 2(1):1013. Zink, A. (2013). A coarse Bayesian approach to evaluate luminescence ages. Geochronometria, 40(2):90–100. Zink, A. (2015). Bayesian analysis of luminescence measurements. Radiation Measurements, 81:71–77.
TE
D
A JAGS implementation of Gaussian mixture model
EP
For the implementation, parameter name has changed: D is denoted by mu[1] and σD , σ by sd[1],sd[2], and the vector (p1 , ..., pκ ) is denoted by proba.
AC C
1 model { 2 # N i s t h e sample s i z e 3 # D [ 1 ] , . . . , D[N] a r e o b s e r v e d d a t a (De v a l u e s ) 4 # N c l u s t e r i s t h e number o f component i n t h e m i x t u r e 5 # w e i g h t i s a v e c t o r o f r e a l o f 1/ N c l u s t e r o f s i z e N c l u s t e r 6 7 # Likelihood : 8 for ( i i n 1 : N ) { 9 D[ i ] ~ dnorm(mu[ Z [ i ] ] , p r e c i s i o n [ Z [ i ] ] ) 10 Z [ i ] ~ d c a t ( proba ) # unobserved v a r i a b l e s 11 } 12 13 # P r i o r d i s t r i b u t i o n o f mean p a r a m e t e r s : 14 mu [ 1 ] ~dnorm( 2 0 0 , 2E−5)T( 0 , ) 15 for ( c l u s t I d x i n 2 : N c l u s t e r ) { 16 mu[ c l u s t I d x ] ~dnorm( 2 0 0 , 2 . 0 E−5)T(mu [ ( c l u s t I d x − 1 ) ] , ) 17 } 18 19 # Prior d i s t r i b u t i o n of standard d e v i a t i o n parameters :
36
ACCEPTED MANUSCRIPT
p r e c i s i o n [ 1 ] ~dgamma( 0 . 1 , 0 . 0 0 0 1 ) sd [ 1 ]<−sqrt ( 1 / p r e c i s i o n [ 1 ] ) p r e c i s i o n_temp~dgamma( 0 . 1 , 0 . 0 0 0 1 ) sd [ 2 ]<−sqrt ( 1 / p r e c i s i o n_temp ) for ( c l u s t I d x i n 2 : N c l u s t e r ) { p r e c i s i o n [ c l u s t I d x ]<−p r e c i s i o n_temp }
RI PT
20 21 22 23 24 25 26 27 28 29 30
# p r i o r d i s t r i b u t i o n of weight parameters : proba~d d i r c h ( w e i g h t ) }
SC
B JAGS implementation of log-normal mixture model
Implementation of the model selection
D
C
M AN U
We do not detail the JAGS implementation of the log-normal mixture model, since it differs from the Gaussian model only in line 14. Indeed, in the lognormal version, the parameter mu[1] is not truncated to 0, you may replace mu[1] ∼ dnorm(200, 2E − 5)T (0, ) by mu[1] ∼ dnorm(200, 2E − 5). In addition, instead of intering De observations for D, the user inters the logarithm of the data (log(De )).
AC C
EP
TE
We present a R code to implement the model selection between the best model of each parametric families: Gaussian mixtures family and log-normal mixtures family. Let D be the vector of equivalent doses. Let kappa_n be the component number selected by the KoBIC for the Gaussian mixture family, and kappa_ln be the component number selected by the KoBIC for the lognormal mixture family. Let RES be the vector of bayes estimates of parameters of a mixture of kappa_n Gaussian distributions. This vector is organized as follow: the kappa_n first values are dedicated to mean estimates, the two following values are variance estimates, the last kappa_n values are weight estimates. Let RES_log be the vector of bayes estimates of parameters of a mixture of kappa_ln log-normal distributions. This vector has similar organisation than the vector RES.
1 2 3 4 5 6 7 8 9 10 11
D<−sort (D) N=length (D) N1=50000 #−−− g a u s i a n m i x t u r e l i k e l i h o o d f n=RES [ kappa_n+3]∗dnorm(D, RES [ 1 ] , RES [ kappa_n +1]) for ( k i n 2 : kappa_n ) { f n=f n+RES [ kappa_n+2+k ] ∗dnorm(D, RES [ k ] , RES [ kappa_n +2]) } #−−− l o g −normale m i x t u r e l i k e l i h o o d f l n=RES_log [ kappa_l n +3]∗ dlnorm (D, RES_log [ 1 ] , RES_log [ kappa_l n +1])
37
ACCEPTED MANUSCRIPT
M AN U
SC
RI PT
for ( k i n 2 : kappa_l n ) { f l n=f l n+RES_log [ kappa_l n+2+k , kappa_l n ] ∗ dlnorm (D, RES_log [ k ] , RES_log [ kappa_l n +2]) } # co mp u ta t io n o f t h e p o s t e r i o r d i s t r i b u t i o n o f t h e w e i g h t p=matrix ( data=NA, ncol =2,nrow=N1) p [1 ,]= r d i r i c h l e t ( c (1/2 ,1/ 2)) Z=matrix ( data=0, ncol=N, nrow=N1) for ( i i n 1 :N) { u=runif ( 1 , 0 , 1 ) Z [ 1 , i ]=sum( u<(p [ 1 , 2 ] ∗ f l n [ i ] / ( p [ 1 , 2 ] ∗ f l n [ i ]+p [ 1 , 1 ] ∗ f n [ i ] ) ) ) } nb0=rep ( 0 , N1) nb0 [ 1 ] =sum(Z[ 1 , ] = = 0 ) pb=t x t P r o g r e s s B a r (min=2,max=N1 , s t y l e =3) for ( t i n 2 : N1) { s e t T x t P r o g r e s s B a r ( pb , t ) p [ t , 1 ] = rbeta ( 1 , ( 1 /2+nb0 [ ( t − 1 ) ] ) , ( 1 /2+N−nb0 [ t − 1 ] ) ) p [ t ,2]=1 −p [ t , 1 ] for ( i i n 1 :N) { u=runif ( 1 , 0 , 1 ) Z [ t , i ]=sum( u<(p [ t , 2 ] ∗ f l n [ i ] / ( p [ t , 2 ] ∗ f l n [ i ]+p [ t , 1 ] ∗ f n [ i ] ) ) ) } nb0 [ t ]=sum(Z [ t , ] < 1 ) } # bayes e s t i m a t e s of the weight parameters mean( p [ ( N1/ 2 ) : N1 , 1 ] ) mean( p [ ( N1/ 2 ) : N1 , 2 ] )
AC C
EP
TE
D
12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
38
ACCEPTED MANUSCRIPT
Claire Christophe
, Anne Philippeb , Guillaume Guérina , Norbert Merciera and Pierre Guiberta
IRAMAT-CRP2A, Maison de l’archéologie, Université de Bordeaux Montaigne, Esplanade des Antilles 33607 Pessac Cedex, France. b Laboratoire de Mathématique Jean Leray, Université de Nantes, 2 Chemin de la Houssinière, 44322 Nantes, France.
M AN U
SC
a
ab
RI PT
Bayesian approach to OSL dating of poorly bleached sediment samples: Mixture Distribution Models for Dose (MD2)
June 3, 2017
• Mixture of a parametric distribution and a non parametric distribution is used to describe poorly bleached OSL measurement; • This model is developed in a Bayesian framework;
AC C
EP
TE D
• A JAGS implementation of the model is provided to be usable.
1