Computational Statistics and Data Analysis 71 (2014) 506–519
Contents lists available at ScienceDirect
Computational Statistics and Data Analysis journal homepage: www.elsevier.com/locate/csda
A Bayesian model for longitudinal circular data based on the projected normal distribution Gabriel Nuñez-Antonio a,b,∗ , Eduardo Gutiérrez-Peña c a
Department of Statistics, Universidad Carlos III de Madrid, Spain
b
Department of Mathematics, UAM-I, Mexico
c
Department of Probability and Statistics, IIMAS-UNAM, Mexico
article
info
Article history: Received 10 October 2011 Received in revised form 25 July 2012 Accepted 26 July 2012 Available online 3 August 2012 Keywords: Circular data Gibbs sampler Latent variables Longitudinal data Mixed-effects linear models Projected normal distribution
abstract The analysis of short longitudinal series of circular data may be problematic and to some extent has not been fully developed. A Bayesian analysis of a new model for such data is presented. The model is based on a radial projection onto the circle of a particular bivariate normal distribution. Inference about the parameters of the model is based on samples from the corresponding joint posterior density, which are obtained using a Metropolis-withinGibbs scheme after the introduction of suitable latent variables. The procedure is illustrated using both simulated data sets and a real data set previously analyzed in the literature. © 2012 Elsevier B.V. All rights reserved.
1. Introduction Several approaches have been proposed to analyze longitudinal data. See, for example, Diggle et al. (2002), Fitzmaurice et al. (2004), Gelman and Hill (2007), and Hedeker and Gibbons (2006). All these books discuss models for longitudinal ‘scalar’ (i.e. linear) responses. In contrast, methodological proposals to describe relationships within repeated measurements of directional data are rather limited. This may be due to the difficulties in working with probability distributions commonly associated with directional data and to the intrinsic dependency inherent to longitudinal structures. Circular data are a particular case of directional data. Specifically, circular data represent directions in two dimensions. For a survey of this and related topics, we refer the reader to Fisher (1993), Fisher et al. (1987), Jammalamadaka and SenGupta (2001) and Mardia and Jupp (2000). More recent contributions include Abe and Pewsey (2011), Oliveira et al. (2012), Pewsey (2008) and Qin et al. (2011). See also Arnold and SenGupta (2006) for an overview of applications of circular data analysis in ecological and environmental sciences. In many of these applications, the observations on the variable of interest are longitudinal in nature. For example, in studies concerning the orientation mechanism of birds, it is relevant to analyze the angular differences between a bird’s position at consecutive times after release (Artes and Jørgensen, 2000; Artes et al., 2000). In motion studies of small animals, the aim is often to describe the effect of covariates on the directional behavior of those animals (D’Elia, 2001; D’Elia et al., 2001). On the other hand, in the analysis of cell-cycle gene expression data, a problem of interest is to estimate the phase angles using information regarding the order among them (Rueda et al., 2009).
∗ Correspondence to: Departamento de Estadística, Universidad Carlos III de Madrid, C/Madrid 126, 28903, Getafe (Madrid), Spain. Tel.: +34 91 624 5713; fax: +34 91 624 9849. E-mail address:
[email protected] (G. Nuñez-Antonio). 0167-9473/$ – see front matter © 2012 Elsevier B.V. All rights reserved. doi:10.1016/j.csda.2012.07.025
G. Nuñez-Antonio, E. Gutiérrez-Peña / Computational Statistics and Data Analysis 71 (2014) 506–519
507
All of these situations illustrate the need for models that allow us to analyze longitudinal structures where the response variable is circular. As pointed out above, from a methodological point of view there does not seem to be a general framework for the analysis of longitudinal directional data. Longitudinal data where the response variable is circular have been analyzed using quasilikelihood methods, such as the generalized estimating equations (GEE) originally proposed by Liang and Zeger (1986) to analyze linear data. Specifically, Artes et al. (2000) derive estimating equations for the parameters of a family of circular distributions and obtain asymptotic inference for the parameters of a mixed-effects model. In turn, Artes and Jørgensen (2000) extend GEE methods to deal with Jørgensen’s dispersion models (Jørgensen, 1997a,b) and employ this approach to model longitudinal circular data. They also present a simulation study for a simple model which only involves the mean direction and a single covariate. They note that in some situations their proposal may have troubles with convergence, and point out that their method requires either high correlations between the longitudinal observations or large samples in order to achieve satisfactory performance. On the other hand, D’Elia et al. (2001) propose a generalized linear model to study the directional behavior of sandhoppers under natural conditions in repeated trials. In the same vein, D’Elia (2001) assumes a variance components model to describe the orientation mechanism and uses a simulated maximum likelihood approach. She points out that the use of this approach may raise several problems. Recently, Song (2007) has used a generalized linear model approach where the random component belongs to the family of dispersion models. He suggests using penalized pseudo-likelihood and restricted maximum likelihood estimation to bypass the analytical difficulties arising from the nonlinearity of the corresponding score functions. Nevertheless, in some cases it is not possible to make inferences about all the parameters involved in the proposed models. We feel that most of the procedures currently available for analyzing longitudinal data with circular responses suffer from certain flaws that render some of the required inferences unfeasible in general settings. These limitations include troubles for fitting, model comparison and prediction, as well as convergence problems of some of the iterative methods employed. Song (2007) argues that different approaches may be required to analyze series of repeated measurements, depending on the length of the series. In the case of (several) short time series, modeling typically focuses on the relationship between the response variable and the corresponding covariates. In such situations the correlations are treated as nuisance parameters, as opposed to the case of a long time series where the correlations are usually modeled explicitly via a stochastic process. In this paper, we introduce a new model to describe short series of longitudinal data where the response variable is circular. The model considers linear covariates and is based on a version of the projected bivariate normal distribution. In our proposal, each of the components of the model is specified by a mixed-effects linear model. In addition, we present a Bayesian analysis that allows us to make inference about any of the parameters of interest. The paper is organized as follows. In the next section, we introduce the projected circular longitudinal model (henceforth called the PCL model) and describe some of its properties, including the longitudinal structures that can be obtained from it. In Section 3, we discuss a Bayesian analysis of the model and derive all the full conditionals needed for a Gibbs sampler. In Section 4, we present some illustrative examples. Finally, Section 5 contains some concluding remarks. 2. The PCL model 2.1. Description of the model We start this section by briefly reviewing the projected normal distribution. For further details, we refer the reader to Mardia and Jupp (2000), Nuñez-Antonio and Gutiérrez-Peña (2005), Presnell et al. (1998), Presnell and Rumcheva (2008), and the references therein. There are several ways of generating probability distributions for circular data. One relatively straightforward way is to radially project on the unit circle probability distributions originally defined on the plane. In the general case, let Y be a q-dimensional random vector such that Pr(Y = 0) = 0. Then U = ∥Y ∥−1 Y is a random point on the q-dimensional unit sphere. Its mean direction is the unit vector η = E (U )/ρ , where ρ = ∥E (U )∥, 0 ≤ ρ ≤ 1; here E (·) represents the usual expectation for random vectors, and ∥ · ∥ denotes the usual Euclidean norm. The parameter ρ is called the mean resultant length and can be regarded as a measure of concentration. An important instance of this class of distributions is that in which Y has a q-variate normal distribution, Nq (·|µ, Λ), with mean vector µ = E (Y ) and precision matrix Λ = Var(Y )−1 . In this case U is said to have a q-dimensional projected normal distribution, here denoted by PN (·|µ, Λ). In the circular case, q = 2, U is a two-dimensional unit vector and so it can be alternatively specified by means of a single angle Θ , say. A version of the projected normal linear model for the circular case has been analyzed by Presnell et al. (1998) using a maximum likelihood approach. Nuñez-Antonio et al. (2011) present and discuss a Bayesian analysis of the same model. See also Nuñez-Antonio and Gutiérrez-Peña (2005). The aim of this work is to introduce a model to describe short series of longitudinal data, where the response is a circular variable Θ , in terms of one or more explanatory variables or covariates x = (x1 , . . . , xm )t . Even though the results presented here can be extended to other cases, only linear covariates will be considered. Suppose that, on each of a number of occasions j (j = 1, . . . , ni ), measurements are taken on the ith individual in the study (i = 1, . . . , N ) and arranged in an ni × 1 vector of responses θ i = (θi1 , . . . , θini )t . Thus, we have a design with N individuals and ni angular observations, θij , on each individual.
508
G. Nuñez-Antonio, E. Gutiérrez-Peña / Computational Statistics and Data Analysis 71 (2014) 506–519
Our model assumes that each observation Θij follows a projected normal distribution PN (·|µij , I ) with density function given by f (θij |µij , I ) =
1 2π
1
exp − ∥µij ∥ 2
2
vijt µij
1 +
φ(vijt µij )
Φ ( µij ) 1(0,2π] (θij ) 1R2 (µij ), vijt
(1)
where vijt = (cos θij , sin θij ), and µij defines the corresponding mean direction. Here, φ(·) and Φ (·) denote the probability density function and cumulative distribution function (respectively) of the standard normal distribution. In addition, for i = 1, . . . , N, and j = 1, . . . , ni , we assume the following structure for the vector µij :
µij =
µIij
=
µIIij
(xIij )t βI + (zijI )t bIi (xIIij )t βII + (zijII )t bIIi
.
Here, the vectors {βI , βII } are the coefficients of the model and the vectors {bIi , bIIi } represent subject-specific random effects, usually assumed to be normally distributed with mean vector 0 and precision matrix , say. The vectors zijI and zijII are part of the design matrix. Note that, in general, each of the two components of µij may depend on different subsets of covariates, in which case βI and βII may have different dimensions (the same holds for the vectors bIi and bIIi ). A first important step to construct the PCL model is to consider a data-augmented model via the introduction of a suitable set of latent variables, Rij , in such a way that
Yij =
YijI
= Rij ×
YijII
cos Θij sin Θij
,
i = 1, . . . , N , j = 1, . . . , ni ,
where Yij follows the bivariate normal distribution N2 (·|µij , I ) and Rij = ∥Yij ∥. Note that the joint density of (Θij , Rij ), denoted by f(Θij ,Rij ) (θij , rij ), can be obtained from that of Yij by transforming to polar coordinates. This joint distribution is constructed in such a way as to ensure that we can simulate from all the posterior conditional densities required for a Gibbs sampler (see Section 3.1). Note also that, by construction, the marginal density of Θij is precisely the projected normal density (1). In this setting, we can regard the Rij as missing data. The notion of latent variables is common in missing data problems. However, in some applications, latent variables may have a natural, context-specific interpretation. For example, Presnell et al. (1998) suggest that the vanishing directions (relative to the observer) of a particular species of butterflies might approximately follow a bivariate normal distribution. On the other hand, in studies concerning wind directions, the wind velocity is typically modeled through a bivariate normal distribution (see, for example Mardia and Jupp, 2000). Thus, with a slight abuse of notation, we can write f (yij = rij (cos θij , sin θij )t |βI , βII , {bi }I , {bi }II , xIij , xIIij , zijI , zijII ) = N2 (yij |µij , I ). We emphasize that, in the above specification, we are assuming a mixed-effects model for each of the two components of the PCL model. In matrix notation, YiI = XiI βI + ZiI bIi + εIi YiII
=
XiII
β + II
ZiII bIIi
and
+ε , II i
where and are vectors of dimensions nIi and nIIi , respectively (i = 1, . . . , N ). Recalling that the vectors bi represent subject-specific random effects, it may now be realistic to assume that, given the bi , the components of each of the vectors εIi and εIIi are mutually independent. This would allow us to set their covariance matrices to 6i = σ˜ 2 Ini , as in the original paper by Laird and Ware (1982). In fact, in our approach we will use the structure 6i = Ini so as to avoid identifiability problems (see, for example Nuñez-Antonio and Gutiérrez-Peña, 2005). Summing up, if we consider each of the two components k ∈ {I , II } separately, the hierarchical specification of the PCL model based on the augmented data is given as follows. YiI
YiII
• Stage 1. For each individual i, Yik | βk , {bki } ∼ Nni (Xik βk + Zik bki , Ini ),
i = 1, . . . , N .
Equivalently, given β and {bki }, k
Yik = Xik βk + Zik bki + εki ,
i = 1, . . . , N ,
where ε ∼ Nni (0, I ). • Stage 2. The vectors βk and bki are assumed independent, and k i
βk ∼ Npk (0, Ak ), bki |k ∼ Nqk (0, k ),
i = 1, . . . , N ,
where pk and qk denote the dimensions of βk and bki , respectively.
G. Nuñez-Antonio, E. Gutiérrez-Peña / Computational Statistics and Data Analysis 71 (2014) 506–519
509
Fig. 1. Some longitudinal patterns for circular data produced by the PCL model.
• Stage 3. k ∼ Wi(v k , Bk ),
v k ≥ qk ,
i.e., a Wishart distribution. In this parameterization E (k ) = v k (Bk )−1 . In terms of the augmented data, the PCL model is basically a standard bivariate mixed-effects model. Note also that we use conjugate priors on βI , βII , I and II to complete the specification of the model. 2.2. Longitudinal structures from the PCL model The PCL model is flexible enough to describe a variety of longitudinal patterns for short series of circular data. Fig. 1 shows some of these behaviors. It can be seen that the PCL model is able to reproduce a range of patterns, including random intercept, random slope and random intercept–slope, similar to those obtained from a standard (linear) mixed-effects model. The model can also produce more general dependence patterns such as that presented in the lower-right panel of Fig. 1. Covariance-pattern models (see, for example Hedeker and Gibbons, 2006) are commonly used to describe repeated measures and include covariance matrices with compound symmetry, first-order autoregressive, Toeplitz and banded structures, among others. These models do not necessarily distinguish between within-subject and between-subject variation. Instead, they assume specific forms for the variance–covariance matrix. In contrast, in the definition of the PCL model, the covariance matrix of the vector Yi (for each of the two components k ∈ {I , II }) is given by V (Yi ) = Zi −1 Zit + Ini .
(2)
We note that, unlike covariance-pattern models, in the case of the PCL model the variance–covariance matrix of the repeated measures Yi can be decomposed into between-subject and within-subject variation. As an example of what the relation (2) may imply, consider a model that includes a random intercept, a time trend, and three measurements (ni = 3) for each individual. The matrix Zi can then be specified as
Zi =
1 1 1
0 1 , 2
∀ i = 1, . . . , N ,
in which case the matrix V (Yi ) becomes
σ12 σ12 + σ12 σ12 + 2σ12
σ12 + σ12 2 σ1 + 2σ12 + σ22 σ12 + 3σ12 + 2σ22
σ12 + 2σ12 σ12 + 3σ12 + 2σ22 + I3 , σ12 + 4σ12 + 4σ22
510
G. Nuñez-Antonio, E. Gutiérrez-Peña / Computational Statistics and Data Analysis 71 (2014) 506–519 Table 1 Sample circular autocorrelation coefficients obtained from a PCL model. Occasion
1
2
3
4
5
1 2 3 4 5
1 – – – –
0.694 1 – – –
0.502 0.847 1 – –
0.363 0.791 0.937 1 –
0.269 0.738 0.927 0.971 1
Table 2 Circular autocorrelation coefficients corresponding to the pattern shown in the lower-right panel of Fig. 1. Occasion
1
2
3
4
5
1 2 3 4 5
1 – – – –
0.548 1 – – –
0.028 0.127 1 – –
−0.382 −0.193
−0.474 −0.363
0.482 1 –
0.312 0.661 1
where the parameters σ12 , σ22 , and σ12 are the elements of the matrix
−1 =
σ12 σ12
σ12 , σ22
and I3 is the identity matrix. Thus, it can be seen that, for specific values of the parameters σ12 , σ22 , and σ12 , and specific matrices Zi , a model with a random intercept and a random slope allows variances and covariances to change across time. Models with covariance matrices other than the identity matrix can allow for more general covariance structures than (2). For example, in order for the PCL model to describe some forms of autocorrelated errors, at least with regard to the complete data Yi , the identity matrix Ini in (2) must be replaced with a matrix of the form σ˜ 2 6i , so that V (Yi ) = Zi −1 Zit + σ˜ 2 6i .
(3)
In this way, if the matrix 6i in (3) is defined according to a covariance-pattern model, the PCL model could in principle describe autocorrelated behaviors similar to those from autoregressive or Toeplitz structures. However, as pointed out in Section 2.1, due to identifiability issues an appropriate value for the parameter σ˜ 2 in (3) is 1. As another illustration, suppose that the recording times of the responses are regarded as cyclical. In that case, it would be possible to define the matrix 6i in (3) in terms of the circular linear exponent autoregressive (LEAR) covariance pattern proposed by Simpson and Edwards (2011); see also Hartley and Naik (2001). Our version of the PCL model does not consider any specific covariance pattern (such as the ones discussed above) in the specification of the matrix 6i . Nevertheless, the model is still able to describe and produce a range of autocorrelation structures for the angular measures Θi similar to those from first-order autoregressive matrices, or resembling the pattern shown in the lower right panel of Fig. 1. Table 1 shows some sample circular autocorrelation coefficients obtained from a PCL model with a structure similar to that of Example 2 with βI = (100, −4)t , βII = (20, −10) and (II )−1 = Diag(5, 3). It can be seen that the values in Table 1 are comparable to those from a first-order autoregressive process with autocorrelation parameter equal to 0.7. Similarly, Table 2 shows the sample circular autocorrelation coefficients for a data set simulated from the model giving rise to the lower-right panel of Fig. 1. These coefficients were obtained using the circ.cor function of the CircStats R package (see Jammalamadaka and SenGupta, 2001). We note in closing that in general it is not easy to see how a particular covariance pattern observed on actual repeated measures Θi can be generated from a covariance pattern on the augmented data Yi . Further work is needed in this regard. 3. Inference via MCMC 3.1. Full conditional densities Let D = {(r11 , θ11 ), . . . , (rNnN , θNnN )} be a set of observations from the PCL model. From the discussion in Section 2.1, and omitting the superscript k for notational convenience, we can see that the full conditional densities for the parameters and latent variables of each of the components k ∈ {I , II } are given by
f (β|{bi }, , D) = Np
β|C
−1
N
Xit ei
,C ,
i=1 1 t ˜ f (bi |β, , D) = Nq (bi |D− i Zi ei , Di )
∀ i = 1, . . . , N
G. Nuñez-Antonio, E. Gutiérrez-Peña / Computational Statistics and Data Analysis 71 (2014) 506–519
511
and
f (|{bi }, D) = Wi v + N , B +
N
bi bti
,
i=1
where C =
N
Xit Xi + A,
i =1
ei = Yi − Zi bi , Di = Zit Zi + and e˜i = Yi − Xi βi . We note that the Rij are conditionally independent given the Θij . Thus, the full conditional densities of Rij are given, up to a constant of proportionality, by
f (rij |{θij }, β, bi ) ∝ rij exp −
1 2
[rij2 − 2(vijt µij )rij ]
.
Details concerning the properties of the conditional densities f (rij |·) and f (θij |·) can be found in Nuñez-Antonio (2010) and Nuñez-Antonio and Gutiérrez-Peña (2005). Clearly, sampling from f (β|{bi }, , D), f (bi |β, , D) and f (|{bi }, D) is not difficult as they are all standard distributions. On the other hand, we can sample each Rij separately from f (rij |θij , β, bi ) and in this way draw a random matrix R from f (R |{θij }, β, {bi }); this is carried out via a Metropolis–Hastings step. We can now use all the previous full conditionals in a Gibbs sampler to get a sample from the joint posterior density f (βI , βII , {bIi }, {bIIi }, I , II , R | {θij }).
(4)
3.2. Sampling strategies Direct implementation of the MCMC scheme described above will typically lead to a slow-mixing chain and potential convergence problems. This is due, in part, to the structure of the mixed-effects model within each component of the projected normal distribution and to the high dimension of the full hierarchical model. In the particular case of mixed-effects models for longitudinal scalar data, several methods have been proposed in the literature in order to improve the efficiency of MCMC methods. See, for example, Chib and Carlin (1999), Gelfand and Sahu (1999), Gelfand et al. (1995), Gilks and Roberts (1996) and Vines et al. (1996). Here, we adapt and employ a method proposed by Chib and Carlin (1999) to simulate the fixed effects β and all the random effects bi in a single block within each component of the PCL model. Specifically, our algorithm to sample from the posterior distribution of all the parameters of the PCL model is the following.
• For each component k ∈ {I , II }, 1. Sample βk and {bki } from f (βk , {bki }|I , II , D) = f (βk , {bki }|k , D) by sampling 1.1 βk from f (βk |I , II , D) = f (βk |k , D), 1.2 bki from f (bki |βI , βII , I , II , D) = f (bki |βk , k , D), for each i = 1, . . . , N . 2. Sample k from f (k |βI , βII , {bIi }, {bIIi }, D) = f (k |{bki }, D). • Sample Rij from f (rij |βI , βII , {bIi }, {bIIi }, I , II , {θij }), for each i = 1, . . . , N and each j = 1, . . . , ni . • Repeat until convergence is achieved. In 1.1 above, the conditional densities of βk , k ∈ {I , II }, are given by f (βk |k , D) = Npk (βk |µkβ , Λkβ ), where
µβ = (Λβ )
k −1
k
N
(
)(
Xik t
)
Vik −1 Yik
and
i=1
N k t k −1 k Λβ = A + (Xi ) (Vi ) Xi ; k
k
i =1
see Chib and Carlin (1999). This algorithm with blocking improves the mixing of the chain, and thus its convergence, especially when the blocking is applied separately to each component of the PCL model.
512
G. Nuñez-Antonio, E. Gutiérrez-Peña / Computational Statistics and Data Analysis 71 (2014) 506–519
Fig. 2. Longitudinal circular data for Example 1.
4. Examples We used the R language and environment (R Development Core Team, 2012) to simulate the data sets for Examples 1 and 2, and to carry out all of the analyses in this section. Example 1. For this illustration we simulated a longitudinal sample of size N = 65. This sample represents five repeated measurements on each of N = 65 individuals. The data were obtained using the following specification of the PCL model: YiI | βI ∼ N5 (XiI βI , I ), YiII | βII , {bIIi } ∼ N5 (XiII βII + ZiII bIIi , I ),
i = 1, . . . , 65,
where
βI = (1, − 4, 10, 0.5)t , βII = (2, 3, 10, 0.2)t and bIIi |Ω II ∼ N (0, σ 2 = 5), i = 1, . . . , 65. For this example we considered two covariates, X and X˜ . Values for these covariates were simulated from a Poisson (5) distribution and a Bernoulli (0.4) distribution, respectively. We built the matrices XiI and XiII as
XiI
=
XiII
1 1 = 1 1 1
Occasion 0 1 , 2 3 4
xi1 xi2 xi3 xi4 xi5
x˜ i1 x˜ i2 x˜ i3 x˜ i4 x˜ i5
i = 1, . . . , 65,
whereas ZiII was set to a vector of ones for all i = 1, . . . , 65. This structure of the design matrix in fact mimics the structure of the PCL model used to analyze the real data set of Example 3. Fig. 2 shows the corresponding data set. For the analysis of these data, we used a vague prior distribution with AI = 0, AII = 0, v II = 1 and BII = 0.001. The resulting marginal distribution for the variance component σ 2 , as well as the component-wise marginal distributions for the vectors βI and βII , are shown in Fig. 3. Note that the true value of each of the parameters is well inside the corresponding highest posterior density region. In particular, it can be seen that the proposed methodology yields appropriate inferences about the random effects. In this example, individual effects are described by a random intercept which is controlled by the parameter σ 2 . In order to properly assess the properties of our procedure, we simulated 100 further samples from the same model and fitted the PCL model to each of these samples. For each sample and each parameter in the model, a 95% credible interval was obtained. In particular, Fig. 4(a)–(c) show these intervals for the parameters β11 , β32 and σ 2 , respectively. In addition, the proportion of these intervals containing the true value for each parameter were computed; see Table 3. This analysis suggests that the inferences obtained for the parameters of the PCL model are appropriate. Example 2. A feature of the PCL model is its ability to separate within-subjects variance from between-subjects variance by means of random effects. In the PCL model, these random effects are given by the parameters bki , k ∈ {I , II }. In fact, the covariance matrices (k )−1 control the between-individual variation. Thus, making inferences on the elements of the matrices k is particularly relevant.
G. Nuñez-Antonio, E. Gutiérrez-Peña / Computational Statistics and Data Analysis 71 (2014) 506–519
513
Fig. 3. Posterior distributions of the parameters of the PCL model for Example 1.
Fig. 4. 95% credible intervals for each of 100 simulated samples from the PCL model of Example 1. The continuous line represents the actual value of the corresponding parameter.
514
G. Nuñez-Antonio, E. Gutiérrez-Peña / Computational Statistics and Data Analysis 71 (2014) 506–519 Table 3 Proportion of 95% credible intervals that contain the actual value of the parameter (Example 1).
βI
βII
β
β
β
93%
91%
91%
1 0
1 1
β
1 2
1 3
94%
β
2 0
96%
Ω II
β
2 1
92%
β
β
σ2
96%
94%
93%
2 2
2 3
Fig. 5. Longitudinal circular data from the PCL model (Example 2).
In this example, a longitudinal sample with two random effects was simulated. This sample contains five repeated measurements on each of N = 60 individuals. The data were generated using the following specification of the PCL model: YiI |βI ∼ N5 (XiI βI , I ), YiII |βII , {bi }II ∼ N5 (XiII βII + ZiII bIIi , I ),
i = 1, . . . , 60,
where
βI =
100 , −4
βII =
bIIi |II ∼ N2 (0, II ),
200 , −10
i = 1, . . . , 60,
with
( )
II −1
=
(σ12 )II
II σ12
II σ12
(σ22 )II
=
0.0001 0
0 5
and
XiI
=
XiII
=
ZiII
Occasion 0 1 , 2 3 4
1 1 = 1 1 1
i = 1, . . . , 60.
This particular instance of the PCL model has a rather complex structure. Note that the parameter (σ12 )II , which accounts for the presence of one of the random effects, is practically zero. Fig. 5 shows the corresponding data set. For the analysis of these data, we used a vague prior distribution with AI = 0 = AII = 0, v I = v II = 2 and BI = BII = Diag(0.001, 0.001). The resulting component-wise marginal distributions for the vectors βI and βII are presented in Fig. 6. Likewise, the marginal distributions for the components of the covariance matrix (ΩII )−1 are shown in Fig. 7. In addition, the 95% posterior credible intervals for the main parameters of the PCL model are presented in Table 4.
G. Nuñez-Antonio, E. Gutiérrez-Peña / Computational Statistics and Data Analysis 71 (2014) 506–519
515
Fig. 6. Posterior distributions of the elements of βI and βII for Example 2.
Fig. 7. Posterior distributions of the elements of (ΩII )−1 for Example 2.
The Bayesian analysis of the model seems to produce appropriate inferences for all the parameters. As in the previous example, the true value of each of the parameters is well inside the corresponding highest posterior density region. Moreover, we can make inferences about the random effects (via the variance–covariance parameters) despite the complex
516
G. Nuñez-Antonio, E. Gutiérrez-Peña / Computational Statistics and Data Analysis 71 (2014) 506–519 Table 4 95% posterior credible intervals for the parameters of the PCL model (Example 2).
βI
βII
(98.872, 119.515) (−8.404, −0.920)
(198.826, 240.078) (0.000, 0.445) (−18.399, −3.643) (2.358, 6.318) – (−0.681, 1.519)
–
(II )−1
Fig. 8. Longitudinal plot of escape directions (in degrees) for sandhoppers over five consecutive releases.
specification of the PCL model in this case. It must be pointed out, however, that implementation of the proposed methodology in situations such as this may be computationally expensive. Example 3. For this illustration we used a PCL model to analyze a real data set concerning the orientation of sandhoppers (Talitrus saltator) escaping towards the sea in order to avoid the risk of high dehydration. It is believed that sandhoppers will escape towards the sea, taking a course known as the theoretical escape direction. Borgioli et al. (1999) and D’Elia et al. (2001) reported a longitudinal study whose aim was to help understand the escaping mechanism of sandhoppers. In the original study, 144 sandhoppers were observed during two experiments, run during the Autumn of 1995 and the Spring of 1996. For this example, we were only able to obtain a subset of the data. This subset was previously analyzed by Song (2007) and is available at http://www.stats.uwaterloo.ca/~song/BOOKLDA.html. The data concern 65 sandhoppers which were released sequentially on five occasions. Each of their escape directions was recorded, along with certain covariates. The covariates included wind direction, azimuth direction for the sun (Sun), and eye measurements, which were used to construct an eye asymmetry index (Eye). The wind directions were split into four categories (OS for offshore, LSE for longshore-east, LSW for longshore-west and Onshore), with Onshore taken as the reference category. Fig. 8 shows the 65 short time series of angular responses, the escape directions. Our main objective, as in Borgioli et al. (1999), D’Elia et al. (2001), and Song (2007), is to examine which covariates significantly affect the escape direction of sandhoppers. D’Elia (2001), D’Elia et al. (2001) and Song (2007) used a generalized linear model approach and considered a von Mises distribution for the random component. However, none of them offered inferences for all the parameters involved. To analyze the sandhoppers data, here we consider the PCL model given by
µIij = β0I + β1I Sun + β2I Eye + β3I OS + β4I LSW + β5I LSE + β6I Time µ =β +β II ij
II 0
II 1 Sun
+β
II 2 Eye
+β
II 3 OS
+β
II 4 LSW
+β
II 5 LSE
+β
II 6 Time
(5)
+ b0i ,
i = 1, . . . , 65.
We used a vague prior distribution with AI = 0, AII = 0, v II = 1 and BII = 0.001. Figs. 9 and 10 show the posterior distributions of all the parameters for components I and II, respectively. In addition, Table 5 presents the corresponding 95% credible intervals for each component of the PCL model. This analysis suggests that {Sun} and {Eye, OS , LSW } are not relevant for µI and µII , respectively. These results are in agreement with previous analyses of these data. Specifically, the inclusion of the random effect is necessary since the variance parameter (σ 2 )II is significantly different from zero. In other words, there is a significant between-sandhopper variation and that variation may be explained by a random intercept, as suggested by the behavior observed in Fig. 8. The inclusion of a serial autocorrelation component in this particular example can be expected to improve the inference for other aspects of the analysis including prediction, missing data and model selection.
G. Nuñez-Antonio, E. Gutiérrez-Peña / Computational Statistics and Data Analysis 71 (2014) 506–519
517
Fig. 9. Posterior distributions for the parameters of component I (sandhoppers data).
Fig. 10. Posterior distributions for the parameters of component II (sandhoppers data).
5. Concluding remarks In this paper, we have introduced the PCL model, based on a projected normal distribution, to analyze short longitudinal series of circular data. Although the PCL model assumes a conditional independence structure on each of its components, it is quite flexible and can describe several distinct longitudinal patterns. It may also provide the basis for the analysis of (longer) time series of circular data. Furthermore, unlike currently available analyses of models for longitudinal circular data,
518
G. Nuñez-Antonio, E. Gutiérrez-Peña / Computational Statistics and Data Analysis 71 (2014) 506–519 Table 5 95% credible intervals for the parameters of the PCL model (sandhoppers data).
β0 β1 (Sun) β2 (Eye) β3 (OS) β4 (LSW ) β5 (LSE) β6 (Time) σ2
Component I
Component II
(−1.704, −0.280) (−0.007, 0.000) (0.511, 3.123) (−4.010, −1.649) (1.265, 2.253) (0.604, 1.678) (−0.226, −0.028)
(3.530, 6.445) (−0.033, −0.017) (−0.289, 6.144) (−0.584, 2.431) (−0.626, 1.557) (0.999, 3.489) (−0.342, −0.157) (0.982, 2.451)
–
our proposal can be implemented by means of a relatively simple Gibbs sampler and can produce inferences on a variety of quantities of interest, including those related to alternative parameterizations or with prediction. An extension of this work to the analysis of time series of circular data is currently being investigated. Acknowledgments The work of the first author was financed by Grant I0010-150526 of the Programa de Estancias Postdoctoral y Sabáticas al Extranjero para la Consolidación de Grupos de Investigación from CONACYT, Mexico. Support from the Department of Statistics of the University Carlos III of Madrid is also gratefully acknowledged. The work of the second author was partially supported by Sistema Nacional de Investigadores, Mexico. The authors are grateful to an associate editor and two anonymous referees whose comments greatly improved the presentation of this article. References Abe, T., Pewsey, A., 2011. Symmetric circular models through duplication and cosine perturbation. Computational Statistics and Data Analysis 55, 3271–3282. Arnold, B.C., SenGupta, A., 2006. Recent advances in the analyses of directional data in ecological and environmental sciences. Environmental and Ecological Statistics 13, 253–256. Artes, R., Jørgensen, B., 2000. Longitudinal data estimating equations for dispersion models. Scandinavian Journal of Statistics 27, 321–334. Artes, R., Paula, G.A., Ranvaud, R., 2000. Analysis of circular longitudinal data based on generalized estimating equations. Australian and New Zealand Journal of Statistics 42, 347–358. Borgioli, C., Martelli, M., Porri, F., D’Elia, A., Marcheti, G.M., Scapini, F., 1999. Orientation in talitrus saltator (montagu): trends in intrapopulations variability related to environmental and intrinsic factors. Journal of Experimental Marine Biology and Ecology 238, 29–47. Chib, S., Carlin, B.P., 1999. On MCMC sampling in hierarchical longitudinal models. Statistics and Computing 9, 17–26. D’Elia, A., 2001. A statistical model for orientation mechanism. Statistical Methods and Applications 10, 157–174. D’Elia, A., Borgioli, C., Scapini, F., 2001. Orientation of sandhoppers under conditions in repeated trials: an analysis using longitudinal directional data. Estuarine, Coastal and Shelf Science 56, 839–847. Diggle, P.J., Heagerty, P., Liang, K.-Y., Zeger, S.L., 2002. Analysis of Longitudinal Data, second ed. Oxford University Press, New York. Fisher, N.I., 1993. Statistical Analysis of Circular Data. Cambridge University Press. Fisher, N.I., Lewis, T., Embleton, B.J.J., 1987. Statistical Analysis of Spherical Data. Cambridge University Press. Fitzmaurice, G.M., Laird, N.M., Ware, J.H., 2004. Applied Longitudinal Analysis. Wiley, New York. Gelfand, A.E., Sahu, S.K., 1999. Identifiability, improper priors and Gibbs sampling for generalized linear models. Journal of the American Statistical Association 94 (445), 247–253. Gelfand, A.E., Sahu, S.K., Carlin, B.P., 1995. Efficient parameterizations for normal linear mixed models. Biometrika 82, 479–488. Gelman, A., Hill, J., 2007. Data Analysis Using Regression and Multilevel/Hierarchical Models. Oxford University Press, New York. Gilks, W.R., Roberts, G.O., 1996. Strategies for improving MCMC. In: Gilks, W.R., Richardson, S., Spiegelhalter, D.J. (Eds.), Markov Chain Monte Carlo in Practice. Chapman and Hall, London, pp. 89–114. Hartley, A.M., Naik, D.N., 2001. Estimation of familial correlation under autoregressive circular covariance. Communications in Statistics. Theory and Methods 30, 1811–1828. Hedeker, D., Gibbons, R.D., 2006. Longitudinal Data Analysis. John Wiley and Sons, Inc., New Jersey. Jammalamadaka, S.R., SenGupta, A., 2001. Topics in Circular Statistics. World Scientific, Singapore. Jørgensen, B., 1997a. Proper dispersion models (with discussion). Brazilian Journal of Probability and Statistics 11, 89–140. Jørgensen, B., 1997b. The Theory of Dispersion Models. Chapman and Hall, London. Laird, N.M., Ware, J.H., 1982. Random-effects models for longitudinal data. Biometrics 38, 963–974. Liang, K-Y., Zeger, S.L., 1986. Longitudinal analysis using generalized linear models. Biometrika 73, 13–22. Mardia, K.V., Jupp, P.E., 2000. Directional Statistics. Wiley, Chichester. Nuñez-Antonio, G., 2010. Análisis Bayesiano de Modelos Lineales para Datos Direccionales. Unpublished Ph.D. Thesis, Autonomous Metropolitan University, Mexico City, Mexico (in Spanish). Nuñez-Antonio, G., Gutiérrez-Peña, E., 2005. A Bayesian analysis of directional data using the projected normal distribution. Journal of Applied Statistics 32, 995–1001. Nuñez-Antonio, G., Gutiérrez-Peña, E., Escarela, G., 2011. A Bayesian regression model for circular data based on the projected normal distribution. Statistical Modelling 11, 185–201. Oliveira, M., Crujeiras, R.M., Rodríguez-Casal, A., 2012. A plug-in rule for bandwidth selection in circular density estimation. Computational Statistics and Data Analysis 56, 3898–3908. Pewsey, A., 2008. The wrapped stable family of distributions as a flexible model for circular data. Computational Statistics and Data Analysis 52, 1516–1523. Presnell, B., Morrison, S.P., Littell, R.C., 1998. Projected multivariate linear model for directional data. Journal of the American Statistical Association 93, 1068–1077. Presnell, B., Rumcheva, P., 2008. The mean resultant length of the spherically projected normal distribution. Statistics and Probability Letters 78, 557–563.
G. Nuñez-Antonio, E. Gutiérrez-Peña / Computational Statistics and Data Analysis 71 (2014) 506–519
519
Qin, X., Zhang, J-S., Yan, X-D., 2011. A nonparametric circular-linear multivariate regression model with a rule-of-thumb bandwidth selector. Computers and Mathematics with Applications 62, 3048–3055. R Development Core Team,, 2012. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, ISBN: 3-900051-07-0, http://www.R-project.org. Rueda, C., Fernandez, M.A., Peddada, S.D., 2009. Estimation of parameters subject to order restrictions on a circle with application to estimation of phase angles of cell-cycle genes. Journal of the American Statistical Association 104, 338–347. Simpson, S.L., Edwards, L.J., 2011. A circular LEAR correlation structure for cyclical longitudinal data. Statistical Methods in Medical Research http://dx.doi.org/10.1177/0962280210395741. Song, X-K.P., 2007. Correlated Data Analysis: Modeling Analytics, and Applications. Springer, New York. Vines, S.K., Gilks, W.R., Wild, P., 1996. Fitting Bayesian multiple random effects models. Statistics and Computing 6, 337–346.