Bayesian local influence of semiparametric structural equation models

Bayesian local influence of semiparametric structural equation models

Computational Statistics and Data Analysis xx (xxxx) xxx–xxx Contents lists available at ScienceDirect Computational Statistics and Data Analysis jo...

722KB Sizes 1 Downloads 73 Views

Computational Statistics and Data Analysis xx (xxxx) xxx–xxx

Contents lists available at ScienceDirect

Computational Statistics and Data Analysis journal homepage: www.elsevier.com/locate/csda

Q1

Q2

Bayesian local influence of semiparametric structural equation models Ming Ouyang a , Xiaodong Yan b , Ji Chen c , Niansheng Tang b , Xinyuan Song a,∗ a

Shenzhen Research Institute & Department of Statistics, Chinese University of Hong Kong, Shatin, New Territories, Hong Kong, China

b

School of Statistics and Mathematics, Yunnan University, Kunming, Yunnan, China

c

Division of Biostatistics, Sager Institute, London, UK

article

info

Article history: Received 11 May 2016 Received in revised form 22 January 2017 Accepted 26 January 2017 Available online xxxx Keywords: Bayesian local influence Latent variables Markov chain Monte Carlo methods Perturbation schemes Semiparametric modeling

abstract The authors develop a Bayesian local influence method for semiparametric structural equation models. The effects of minor perturbations to individual observations, the prior distributions of parameters, and the sampling distribution on the statistical inference are assessed with various perturbation schemes. A Bayesian perturbation manifold is constructed to characterize such perturbation schemes. The first- and second-order influence measures are proposed to quantify the degree of minor perturbations on different aspects of a statistical model via objective functions, such as Bayes factor. Simulation studies are conducted to evaluate the empirical performance of the Bayesian local influence procedure. An application to a study of bone mineral density is presented. © 2017 Elsevier B.V. All rights reserved.

1. Introduction Latent variables that should be examined with multiple observed variables are common in behavioral, social, psychological, and medical studies. Structural equation models (SEMs) are well recognized as the most important statistical tool for assessing the interrelationships among latent variables. In recent years, SEMs have been constantly generalized in many directions to cope with complex situations in reality. One important direction is the development of semiparametric SEMs, which generalize a linear structural equation to a nonparametric one with unspecified smoothing functions to capture the nonlinear relationships among latent variables. However, classical methods in SEMs cannot handle such semiparametric models. Thus, Bayesian approach has become a dominant tool used for recent semiparametric or nonparametric SEM development. In a substantive study, the erroneous results of a statistical inference can be caused by several improper model inputs, such as data and model specification. Since the pioneering work of Cook (1986), local influence analysis has received considerable attention as an important statistical inference beyond estimation, and it has already been applied to a large number of statistical models including SEMs (e.g., Zhu and Lee, 2001; Lee and Tang, 2004; Song and Lee, 2004). However, the previous works were mainly developed in the maximum likelihood (ML) framework. Compared with the rapid development of the local influence in ML framework, the results of Bayesian local influence analysis are limited. McCulloch (1989) was among the very first to study Bayesian local influence analysis, and the study extended the local influence approach to assess the effect of perturbation to prior distribution of Bayesian analysis. Zhu et al. (2011) recently considered a Bayesian



Corresponding author. Fax: +852 26035188. E-mail address: [email protected] (X. Song). URL: http://www.sta.cuhk.edu.hk/xysong/ (X. Song).

http://dx.doi.org/10.1016/j.csda.2017.01.007 0167-9473/© 2017 Elsevier B.V. All rights reserved.

1

2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17

2

M. Ouyang et al. / Computational Statistics and Data Analysis xx (xxxx) xxx–xxx

24

geometric approach to simultaneously assess outliers and/or influential observations, perform model comparison, and conduct sensitivity analysis. Chen et al. (2013) proposed a local influence procedure for Bayesian transformation SEMs. Tang and Duan (2014) conducted a Bayesian influence analysis for generalized partial linear mixed models. Nevertheless, the aforementioned studies focused on either parametric SEMs or semiparametric models without unknown functions of latent variables. Thus, their findings are not directly applicable to the present study. In this article, we develop a Bayesian local influence procedure in the context of a semiparametric SEM for the following reasons. First, the ML-based local influence analysis for a semiparametric SEM with unknown functions of latent variables is infeasible because of high computational complexity. Second, Bayesian local influence enables us to examine the proposed model structure and perform sensitivity analysis along with outlier detection via a simultaneous perturbation to the data, prior distributions, and the sampling distribution. We believe that no study has ever focused on the Bayesian local influence analysis of the semiparametric SEM. Notably, major differences exist between this study and the existing literature. The proposed model accommodates the functional effects of latent variables and their pairwise interactions, whereas the models considered in the preceding studies are unable to assess such kinds of effects. The unknown scales of latent variables raise additional challenges in estimation of their nonparametric functions, leading to complicated prior specification, posterior sampling, and computation of the associated local influence measures. Finally, the proposed method provides new insights into the effects of latent variables and simultaneously examines their importance through appropriate perturbations on the approximations of the nonparametric functions. The remainder of the article is organized as follows. Section 2 defines the semiparametric SEM. The Bayesian P-splines approach used in estimating univariate and bivariate nonlinear functions is also described. Section 3 introduces Bayesian perturbation model and manifold, the first- and second-order local influence measures based on the objective function of Bayes factor, and the associated posterior computation. Section 4 presents the simulation studies to assess the empirical performance of the proposed method. Section 5 reports an application to the study of bone mineral density (BMD). Section 6 concludes the paper.

25

2. Semiparametric SEM

26

2.1. Model description

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23

27 28 29 30 31 32 33 34 35 36

37

38 39 40 41

42

43 44 45 46 47 48 49

Let yi be a p × 1 random vector of observed variables. A measurement equation is defined as follows: yi = Axi + 3ϖ i + ϵi ,

where xi is a vector of fixed covariates, A is a matrix of coefficients, ϖ i = (ϖi1 , . . . , ϖiq ) is a q × 1 vector of latent variables with q < p, 3 is a p × q factor loading matrix, and ϵi is a p × 1 vector of residual errors independent of ϖ i and distributed T as N [0, 9] with a diagonal covariance matrix 9. We partition ϖ i into (ηTi , ξ i )T to examine the interrelationships among latent variables, where ηi (q1 × 1) and ξ i (q2 × 1) denote the outcome and explanatory latent vectors, respectively, and ξ i is assumed distributed as N [0, 8∗ ] with a covariance matrix 8∗ . A nonparametric structural equation used to assess the functional effects of the explanatory observable and latent variables on the outcome latent variables is defined as follows. For an arbitrary element ηih in ηi , i = 1, 2, . . . , n, h = 1, 2, . . . , q1 ,

ηih = gh1 (zi1 ) + · · · + ghD (ziD ) + fh1 (ξi1 ) + · · · + fhq2 (ξiq2 ) +

 u
hhuv (ξiu , ξiv ) + δih ,

(2)

where zi1 , . . . , ziD are fixed covariates, gh1 (·), . . . , ghD (·), fh1 (·), . . . , fhq2 (·); and hhuv (·, ·) are unspecified univariate and bivariate smooth functions, and δih is the residual error distributed as N [0, ψδh ] and independent of ξ i and δih′ for h ̸= h′ . For notational simplification, we suppress the subscript h in (2) by assuming q1 = 1. An extension to the case with q1 > 1 is straightforward. Then, model (2) is simplified to

ηi = g1 (zi1 ) + · · · + gD (ziD ) + f1 (ξi1 ) + · · · + fq2 (ξiq2 ) +

 u
huv (ξiu , ξiv ) + δi .

(3)

The semiparametric SEM defined with (1) and (3) not only releases the linear assumption on the relationships between explanatory and outcome latent variables but also accommodates the pairwise functional interactions of explanatory latent variables. The proposed model is not identifiable without imposing the appropriate identifiability constraints. A simple and common method involves fixing some elements of 3 at preassigned values to identify the measurement equation (Bollen, 1989; Lee, 2007). We need to identify all the univariate and bivariate unknown functions to identify the structural equation. Based on the idea of Song et al. (2014), we impose the identifiability constraints for gd (·), fj (·), and huv (·, ·) as follows:

 50

(1) T

g¯d =

max(z1d ,...,znd ) min(z1d ,...,znd )

gd (z )dz /range(z1d , . . . , znd ) = 0,

(4)

M. Ouyang et al. / Computational Statistics and Data Analysis xx (xxxx) xxx–xxx

f¯j =



3



fj (ξ )p(ξ )dξ = 0,

(5)

1

−∞

h¯ uv (ξv ) = h¯ uv (ξu ) =





−∞  ∞

huv (ξu , ξv )p(ξu )dξu = 0,

for all values of ξv ,

(6)

2

huv (ξu , ξv )p(ξv )dξv = 0,

for all values of ξu ,

(7)

3

−∞

where p(ξi ) is the prior distribution of ξi . Constraints (4)–(7) ensure that gd (·), fj (·), and huv (·, ·) are identifiable and orthogonal to one another.

4 5

2.2. Modeling of nonparametric functions

6

The unknown smooth functions gd (·), fj (·), and huv (·, ·) in model (3) are unspecified, and an estimation based on the data is required. The Bayesian P-splines approach (Lang and Brezger, 2004) can be used to approximate the unknown functions of covariates and latent variables. For covariates zid , we can approximate gd (·) with the traditional B-splines approximation

Kdz

7 8 9

(·), where is the number of splines determined by the number of knots, γdk is an unknown coefficient, and k=1 γ Bzdk (·) is a B-splines basis function of appropriate order. The traditional B-splines approximation of the unknown functions of latent variables cannot be used because latent variables are unobservable, and their observations must be updated in Markov chain Monte Carlo (MCMC) iterations. Consequently, predefining the finite ranges of latent variables and determining the positions of the knots beforehand are impossible. We propose the use of probit transformation method introduced by Song et al. (2014) to address the problem. We let Φ (·) be the cumulative distribution of N (0, 1). The unknown smooth function z dk Bdk

Kdz

10 11 12 13 14 15

Kj

of latent variable fj (·) is modeled by k=1 βjk Bjk (Φ (·)), where Bjk (Φ (·)) is the composite functions of B-splines basis Bjk (·) and probit transformation Φ (·). For bivariate nonparametric functions of latent variables, we consider a tensor product S T B-splines approximation s=1 t =1 buv st Uus (Φ (ξiu ))Vv t (Φ (ξiv )), where Uus (Φ (·)) and Vv t (Φ (·)) are likewise the composite functions of B-splines basis and probit transformation. With B-splines approximation, model (3) can be rewritten as follows: z

ηi =

Kd D   d=1 k=1

γdk Bzdk (zid ) +

Kj q2  

βjk Bjk (Φ (ξij )) +

j =1 k =1

q2  q2  S  T 

buv st Uus (Φ (ξiu ))Vv t (Φ (ξiv )) + δi .

16 17 18 19

(8)

20

u=1 v=1 s=1 t =1

In practice, a common choice of the B-splines basis function is cubic B-splines. We introduce various penalties to the coefficients of B-splines basis functions in Eq. (8) (see Section 3.3) to control the smoothness of the approximations and to prevent overfitting.

21 22 23

3. Bayesian local influence analysis

24

3.1. Bayesian perturbation model and manifold

25

In this section, we introduce Bayesian perturbations model. Let ω = (ωTy , ωTp , ωTs )T be the perturbation mapping from the infinite set , where ωy , ωp , and ωs correspond to perturbations to the data, prior distributions of parameters, and the sampling distribution, respectively. Let p(y , ϖ, θ) be the joint probability density in a Bayesian model. We introduce perturbation ω into the density function denoted by p(y , ϖ, θ|ω). Thus, for ω which varies in , the Bayesian perturbation model M can be defined as a family of probability density p(y , ϖ, θ|ω ∈ ). Assume that p(y , ϖ, θ|ω) has a common dominating measure for all ω ∈ , and a central point ω0 of  exists representing the case of no perturbation, such that p(y , ϖ, θ|ω0 ) = p(y , ϖ, θ). We consider the Bayesian perturbation manifold based on Bayesian perturbation model M to measure each perturbation ω. For a given p(y , ϖ, θ|ω) ∈ M, we let C (t ) = p(y , ϖ, θ|ω(t )) be a smooth curve through the space of M with open interval domains containing 0 with ω(0) = ω0 , and p(y , ϖ, θ|ω(0)) = p(y , ϖ, θ). We let ℓ(y , ϖ, θ|ω(t )) = log p(y , ϖ, θ|ω(t )) = ˙ y , ϖ, θ|ω(t )) = d log C (t )/dt exists log C (t ). We assume that C (t ) is sufficiently smooth such that the tangent vector ℓ( 2 ˙ with Ey ,ϖ,θ (ℓ(y , ϖ, θ|ω(t )) ) < ∞. For all possible smooth C (t ), we form the tangent space Tω M based on the tangent  ˙ y , ϖ, θ|ω(0))C (0)dy ϖ dθ = 0. We let ωj be the jth element ˙ y , ϖ, θ|ω(0)) = d log C (t )/dt |t =0 , which satisfies ℓ( vector ℓ( of ω, and ∂ωj = ∂/∂ωj be the partial derivative with respect to ωj . The inner product of tangent vectors ∂ωi ℓ(y , ϖ, θ|ω) and ∂ωj ℓ(y , ϖ, θ|ω) is

  gij (ω) = Ey ,ϖ,θ ∂ωi ℓ(y , ϖ, θ|ω)∂ωj ℓ(y , ϖ, θ|ω) ,

(9)

where Ey ,ϖ,θ [·] represents the expectation with respect to p(y , ϖ, θ|ω). The metric matrix G(ω) = (gij (ω)) is an expected Fisher information matrix with respect to the perturbation ω, in which gij (ω) indicates the association between ωi and ωj , and gjj (ω) indicates the amount of perturbation introduced by ωj . In accordance to a common practice in the literature (Zhu

26 27 28 29

Q3

30 31 32 33 34 35 36 37 38 39 40

41

42 43 44

4

1 2 3 4 5 6

7

M. Ouyang et al. / Computational Statistics and Data Analysis xx (xxxx) xxx–xxx

et al., 2011; Chen et al., 2013), in this study, we consider the perturbations that are orthogonal to one another so that G(ω(0)) is a diagonal matrix. Next, we consider an appropriate objective function f (ω, ω0 ), which assesses the sensitivity of the interested inference on the perturbation under consideration. The objective function is chosen as f (ω, ω0 ) = f {p(θ|y , ϖ, ω), p(θ|y , ϖ, ω0 )}, which measures the deviance between the perturbed and unperturbed posterior distributions of θ given y and ϖ . Then, the influence measure for comparing p(θ|y , ϖ, ω) and p(θ|y , ϖ, ω0 ) is defined as IM f (ω,ω0 ) =

f (ω, ω0 )2 d(ω, ω0 )2

,

(10)

8

where d(ω, ω0 ) denotes the minimal geodesic distance between p(y , ϖ, θ|ω) and p(y , ϖ, θ|ω0 ) (Zhu et al., 2011).

9

3.2. Local influence measures

10 11 12 13

14

Given that all possible smooth curves C (t ) pass through ω0 = ω(0), the local behavior of f (ω, ω0 ) = f (ω(t ), ω(0)) can be measured with limt →0 IM f (ω(t ),ω(0)) . Taylor’s expansion of f (ω(t ), ω(0)) at 0 is equal to f (ω(0), ω(0)) + f˙ (ω(0)) + 1¨ f (ω(0))t 2 + o(t 2 ), where f˙ (ω(0)) and f¨ (ω(0)) denote the first- and second-order derivatives of f (ω(t ), ω(0)) with respect 2 to t evaluated at 0. The first-order local influence measure is defined as FI (ω(0)) = lim IM f (ω(t ),ω(0)) = lim t →0

15 16 17

t →0

f (ω(t ), ω(0))2 d(ω(t ), ω(0))

2

=

[v T ∂ω f (ω(0))]2 , v T G(ω(0))v

where v = dω(t )/dt |t =0 , ∂ω f (ω(0)) denotes the first-order derivative of f (ω(t ), ω(0)) with respect to ω(t ) evaluated at t = 0, and G(ω(0)) is the Fisher information matrix formed by gij in (9) at t = 0. Based on the Cauchy–Schwarz inequality and Chen et al. (2013), we obtain vf = arg max FI (ω(0)) = [G(ω(0))]−1 [∂ω f (ω(0))],

18

19 20

21

v

23

(12)

FI (vf ) = [G(ω(0))]−1 [∂ω f (ω(0))]2 , where FI (vf ) is a vector of the maximum first-order local inference measures of all components. If FI (ω(0)) = 0, then the second-order local influence measure SI can be used. SI is defined with SI (ω(0)) = lim IM f (ω(t ),ω(0)) = lim t →0

22

(11)

t →0

2f (ω(t ), ω(0)) d(ω(t ), ω(0))

2

=

v T ∂ω2 f (ω(0))v v T G(ω(0))v

,

(13)

where ∂ω2 f (ω(0)) denotes the second-order derivatives of f (ω(t ), ω(0)) with respect to ω(t ) evaluate at ω(0). Based on a derivation similar to Chen et al. (2013), the following is obtained: vs = arg max SI (ω(0)),

24

v



25 26 27 28 29 30 31



where vs is the eigenvector of [G(ω(0))]−1 ∂ω2 f (ω(0)) corresponding to the largest eigenvalue, and SI (vs ) is a vector of the maximum second-order local influence measures of all components. With regard to the choice of the objective function, Ibrahim et al. (2011) introduced several candidates, such as Bayes factor, φ -divergence, and posterior mean distance. In the present study, we focus on Bayes factor because arg maxv FI (ω(0)) ̸= 0 when the Bayes factor was considered as the objective function. Consequently, the computation of the second-order influence measure is unnecessary. We consider the logarithm of Bayes factor as follows: f (ω(t ), ω(0)) = BF (ω(t ), ω(0)) = log p(y |ω(t )) − log p(y |ω(0))

 32

33

34

(14)

SI (vs ) = diag [G(ω(0))]−1 ∂ω2 f (ω(0)) ,

= log

p(y , ϖ, θ|ω(t )) dϖ dθ − log



p(y , ϖ, θ|ω(0)) dϖ dθ.

By assuming the legitimacy of integration interchange with differentiation, we can derive the following equality:

  ∂ω f (ω(0)) = ∂ f (ω(t ), ω(0))/∂ω|t =0 = Eϖ,θ [∂ω ℓ(y , ϖ, θ|ω(t ))] 

t =0

35

where Eϖ,θ [·] denotes the expectation with respect to p(ϖ, θ|y , ω(t )).

36

3.3. Prior distribution and posterior computation

37

(15)

,

(16)

An important issue in the implementation of a full Bayesian analysis is to specify the prior distribution of the unknown parameters. First, we discuss the regression coefficients in Eq. (8). For the univariate functions of covariates and latent

M. Ouyang et al. / Computational Statistics and Data Analysis xx (xxxx) xxx–xxx

5

variables, random walk priors coped with the constrained Gaussian density are assigned to γ d = (γd1 , . . . , γdK z ) and d βj = (βj1 , . . . , βjKj ) as follows: p(γ d |τγ d ) = p(βj |τβj ) =

τ 

K ∗ /2

γd

d





τβj

 τ γd

exp −

Kj∗ /2



 exp −

2

τβj 2

1 2

 γ⊤ d Mγ d γ d ,

(17)

3

 β β⊤ M βj j , j

(18)

4

where Kd∗ = rank(Mγ d ), Kj∗ = rank(Mβj ); Mγ d and Mβj are penalty matrices based on the random walk priors (Song and Lu, 2010); and τγ d and τβj are inverse smoothing parameters for controlling the amount of penalty. For the regression coefficients associated with the interactions terms, we set the following prior distributions to avoid over smoothing the bivariate functions of latent variables (Song et al., 2014):

5 6 7 8



  S  T  ϑ1st τ1 ϑ1st τ1 (buvst − buvs(t −1) )2 p(buv |τ1 , τ2 ) = exp − 2π 2 s =1 t =2    T  S  ϑ2st τ2 (buvst − buvs(t −1) )2 ϑ2st τ2 × exp − , 2π 2 t =1 s=2

9

(19)

10

where τ1 and τ2 follow the gamma distribution Gamma(αbuv , βbuv ), and ϑlkr s follow the uniform distribution U (0, 1). Second, we specify the prior distributions for the structural parameters, such as A, 3, 9, ψδ , and 8∗ . The following conjugate prior distributions (Lee, 2007; Song and Lee, 2012) are considered. For j = 1, . . . , p,

12

D

p(Aj ) = N [Aj0 , 6Aj0 ],

D

D

p(8

13

p(3j ) = N [3j0 , ψj 63j0 ], D

p(ψj−1 ) = Gamma[αj0 , βj0 ], ∗−1

11

p(ψδ−1 ) = Gamma[αδ 0 , βδ 0 ],

(20)

14

D

) = Wishart [R0 , ρ0 ],

D

where ‘p(·) =’ denotes ‘‘the distribution p(·) is equal to’’; ATj and 3Tj denote the jth row of Aj and 3, respectively; ψj is the jth diagonal element of 9; Aj0 , 3j0 , αj0 , βj0 , αδ 0 , βδ 0 , and ρ0 , as well as positive definite matrices 6Aj0 , 63j0 and R0 , are hyperparameters with preassigned values. The expectation Eϖ,θ (·) in (9) and (16) should be calculated to obtain the local influence measures. However, this expectation is intractable because it involves high-dimensional integration. Thus, we employ MCMC methods, such as Gibbs sampler and Metropolis–Hastings algorithm, to conduct the Bayesian estimation and the local influence analysis. 3.4. Perturbation schemes

16 17 18 19 20

21

In this section, we discuss the computation of the first-order local influence. The main task is to compute the G(ω(0)) involved in the Fisher information matrix (9) and the ∂ω ℓ(y , ϖ, θ|ω(t ))|t =0 involved in the computation of Bayes factor in (16). Given that ωy (t ), ωs (t ), and ωp (t ) denote the perturbation to the data, sampling distributions, and prior distributions of parameters, respectively, they are independent of one another. Therefore, we have ℓ(y , ϖ, θ|ω(t )) = ℓ(y |ϖ, θ, ωy (t )) + ℓ(ϖ|θ, ωs (t )) + ℓ(θ|ωp (t )), and we can separately discuss G(ω(0)) and ∂ω ℓ(y , ϖ, θ|ω(t ))|t =0 that are related to the data, prior distributions of parameters, or sampling distributions only. First, we discuss the perturbation scheme to the data. We let ωy (t ) = (ωy1 (t ), . . . , ωyn (t ))T and ωy (0) = (1, . . . , 1)T . The perturbation scheme to the data likelihood is presented by

22 23 24 25 26 27 28 29

p

   2π ψj 1  , ωyi (t )(yij − ATj xi − 3Tj ϖ i )2 /ψj + log ℓ(y |ϖ, θ, ωy (t )) = − 2 i=1 j=1 ωyi n

∂ωyi (t ) ℓ(yi |ϖ i , θ, ωy (t ))|t =0 = − G(ωy (0)) =

15

p 1 

2 j =1

 (yij − ATj xi − 3Tj ϖ i )2 /ψj − 1 ,

30

31

p

I. 2 We can diagnose the overall effect of the outliers or influential observations on the model by directly perturbing the data likelihood function. Second, we discuss the perturbation schemes corresponding to the prior distributions of model parameters to assess the sensitivity of Bayesian results to prior inputs. We consider the structural parameters of primary interest, including regression-type parameters A and 3, as well as variance/covariance parameters 9 and ψδ . The perturbation schemes are presented as follows:

32

33 34 35 36 37 38

6

1

2

3

4

M. Ouyang et al. / Computational Statistics and Data Analysis xx (xxxx) xxx–xxx

Perturbation of Aj : We let ωA (t ) denote a perturbation and ωA (0) = 1, then

ℓ(Aj |Aj0 , 6Aj0 , ωA (t )) = −

1 2

 pAj log

∂ωA (t ) ℓ(A|Aj0 , 6Aj0 , ωA (t ))|t =0 = GA (ωA (0)) =

1 2

1 2





 + log |6Aj0 | + ωA (t )(Aj − Aj0 )⊤ (6Aj0 )−1 (Aj − Aj0 ) ,

ωA (t ) p   pA − (Aj − Aj0 )⊤ (6Aj0 )−1 (Aj − Aj0 ) , j =1

pA ,

p

5

where pAj denotes the dimension of Aj and pA =

6

Perturbation of 3j : We let ωΛ (t ) denote a perturbation and ωΛ (0) = 1, then

7

8

9

ℓ(3j |3j0 , 63j0 , ωΛ (t )) = −

1 2

 p3j log

∂ωΛ (t ) ℓ(3|3j0 , 63j0 , ωΛ (t ))|t =0 = G3 (ωΛ (0)) =

1 2

1 2

j =1





pAj .

 + log |63j0 | + ωΛ (t )(3j − 3j0 )T (63j0 )−1 (3j − 3j0 ) ,

ωΛ (t ) p   (3j − 3j0 )⊤ (63j0 )−1 (3j − 3j0 ) , p3 − j =1

p3 ,

10

where p3j denotes the number of parameters of 3j excluding the fixed elements and p3 =

11

Perturbation of 9 = diag(ψj ): We let ω9 (t ) denote a perturbation and ω9 (t ) = 1, then

12

13

p

j =1

p3j .

ℓ(ψj−1 |αj0 , βj0 , ω9 (t )) = αj0 log(ω9 (t )βj0 ) − log(Γ (αj0 )) + (αj0 − 1) log(ψj−1 ) − ω9 (t )βj0 ψj−1 , p 

∂ω9 (t ) ℓ(9−1 |αj0 , βj0 , ω9 (t ))|t =0 =

[αj0 − βj0 ψj−1 ],

j =1

14

Gω9 (0) (ωΛ (0)) =

p 

αj0 ,

j =1 15

where Γ (·) is the gamma function.

16

Perturbation of ψδ : We let ωψδ (t ) denote a perturbation and ωψδ (t ) = 1, then

17

ℓ(ψδ−1 |αδ0 , βδ0 , ωψδ (t )) = αδ0 log (ωψδ (t )βδ0 ) − log(Γ (αδ0 )) + (αδ0 − 1) log(ψδ−1 ) − ωψδ (t )βδ0 ψδ−1 ,

18

∂ωψδ (t ) ℓ(ψδ−1 |αδ0 , βδ0 , ωψδ (t ))|t =0 = αδ0 − βδ0 ψδ−1 ,

19 20 21 22 23 24

Gωψ (0) (ωΛ (0)) = αδ 0 . δ

Finally, we discuss the perturbation ωs (t ) corresponding to the sampling distribution to diagnose whether each term of Eq. (8) is important in explaining the variation of the outcome latent variable. For the purpose of illustration, we take the following simple structural equation as an example:

ηi = g (zi ) + f1 (ξi1 ) + f2 (ξi2 ) + h12 (ξi1 , ξi2 ) + δi . Perturbation of covariates zi : We let ωs1 (t ) denote a perturbation and ωs1 (0) = 0, then n 1

25

ℓ(η|ξ, θ, ωs1 (t )) = −

26

∂ωs1 (t ) ℓ(η|ξ, θ, ωs1 (t ))|t =0 =

27

Gzi p (ωs1 (0)) =

n 

 Eϖ,θ

i=1 28

2 i=1

{(ηi − ωs1 (t )g (zi ) − f1 (ξi1 ) − f2 (ξi2 ) − h12 (ξi1 , ξi2 ))2 /ψδ + log(2π ψδ )},

1

ψδ

n 1 

ψδ

g (zi )(ηi − f1 (ξi1 ) − f2 (ξi2 ) − h12 (ξi1 , ξi2 )),

i =1



g (zi )2 .

Perturbation of latent variable ξi1 : We let ωs2 (t ) denote a perturbation and ωs2 (0) = 0, then n 1

29

ℓ(η|ξ, θ, ωs2 (t )) = −

30

∂ωs2 (t ) ℓ(η|ξ, θ, ωs2 (t ))|t =0 =

2 i=1

{(ηi − g (zi ) − ωs2 (t )[f1 (ξi1 ) + h12 (ξi1 , ξi2 )] − f2 (ξi2 ))2 /ψδ + log(2π ψδ )}, n 1 

ψδ

(f1 (ξi1 ) + h12 (ξi1 , ξi2 ))(ηi − g (zi ) − f2 (ξi2 )),

i =1

(21)

M. Ouyang et al. / Computational Statistics and Data Analysis xx (xxxx) xxx–xxx

Gξi1 (ωs2 (0)) =

n 

 Eϖ,θ

1

ψδ

i=1

7

 (f1 (ξi1 ) + h12 (ξi1 , ξi2 ))2 .

1

Perturbation of latent variable ξi2 : We let ωs3 (t ) denote a perturbation and ωs3 (0) = 0, then

2

n

ℓ(η|ξ, θ, ωs3 (t )) = −

1 2 i =1

{(ηi − g (zi ) − f1 (ξi1 ) − ωs3 (t )[f2 (ξi2 ) + h12 (ξi1 , ξi2 )])2 /ψδ + log(2π ψδ )}, n 1 

∂ωs3 (t ) ℓ(η|ξ, θ, ωs3 (t ))|t =0 = Gξi2 (ωs3 (0)) =

n 

 Eϖ,θ

1

ψδ

i=1

3

ψδ

(f2 (ξi2 ) + h12 (ξi1 , ξi2 ))(ηi − g (zi ) − f1 (ξi1 )),

4

i=1

(f2 (ξi2 ) + h12 (ξi1 , ξi2 ))

2



.

5

Perturbation of interaction h12 (ξi1 , ξi2 ): We let ωs4 (t ) denote a perturbation and ωs4 (0) = 0, then

6

n 1 ℓ(η|ξ, θ, ωs4 (t )) = − {(ηi − g (zi ) − f1 (ξi1 ) − f2 (ξi2 ) − ωs4 (t )h12 (ξi1 , ξi2 ))2 /ψδ + log(2π ψδ )},

7

2 i =1

n 1 

∂ωs4 (t ) ℓ(η|ξ, θ, ωs4 (t ))|t =0 = G(ξi1 ,ξi2 ) (ωs4 (0)) =

n 

ψδ

 Eϖ,θ

i =1

1

ψδ

h12 (ξi1 , ξi2 )(ηi − g (zi ) − f1 (ξi1 ) − f2 (ξi2 )),

8

i=1



h12 (ξi1 , ξi2 )2 .

9

For a specific perturbation scheme and an objective function, we use the following steps to implement the Bayesian influence analysis: Step 1. A Bayesian perturbation model p(y , ϖ, θ|ω) is constructed, and the objective function f (ω(t ), ω(0)) is chosen. Step 2. ∂ω f (ω(0)), G(ω(0)) = (∂ω2 j ωk ℓ(y , ϖ, θ|ω(0))), and the first-order local influence measure arg maxv FIf (ω(0)) = FI [vF ] are calculated. Step 3. kmeans method (Forgy, 1965) is applied to cluster FI [vF ] into a major class and a minor class, where the former includes the majority of cases with small values of FI [vF ], whereas the latter includes the minority of cases with very large values of FI [vF ]. The percentage of cases in the minor class exceeding 5% may indicate the nonexistence of outliers or influential observations; otherwise, the minor class provides suspicious influential cases. Step 4. The parameter estimates are checked by deleting or modifying suspicious cases. If they significantly differ before and after the modification, the suspicious cases are diagnosed as influential, and the procedure returns to Step 1 for further diagnosis; otherwise, the procedure ends.

10 11 12 13 14 15 16 17 18 19 20 21

4. Simulation study

22

4.1. Simulation 1

23

We consider a semiparametric SEM defined by (1) and (3) with p = 9, q = 3, q1 = 1, and q2 = 2. For i = 1, . . . , 500 and j = 1, . . . , 9, yij =

ATj xi

+

3Tj

ϖ i + ϵij ,

ηi = g1 (zi1 ) + g2 (zi2 ) + f1 (ξi1 ) + f2 (ξi2 ) + h12 (ξi1 , ξi2 ) + δi ,

26

(23)

27

(zi2 ) = 0.2 exp(3zi2 )− 2/3, f1 (ξi1 ) = exp(3ξi1 )/[1 + exp(3ξi1 )]− 0.5, f2 (ξi2 ) = 0.5(sin(1.5ξi2 )−ξi2 ), where g1 (zi1 ) = h12 (ξi1 , ξi2 ) = 5(Φ (ξi1 ) − 0.5) cos(2π Φ (ξi2 )), and Φ (·) is the cumulative distribution function of N (0, 1); xi , zi1 , and zi2 are generated from uniform distribution U (−1, 1), 1

3Tj = 0 0

λ21

λ31

0 0

0 0

0 1 0

0

0

λ52

λ62

0

0

0 0 1

0 0

λ83

0 0

λ93

25

(22)

3 , g2 2zi1



24

28 29 30

 ,

in which 1s and 0s are fixed to obtain an identified model (Lee, 2007; Song and Lee, 2012), and λs are unknown factor loadings; 8∗ is the covariance matrix of ξ i = (ξi1 , ξi2 ). The true population values of parameters are aj = 0.6, ψj = ψδ = 0.3, j = 1, . . . , 9; λ21 = λ31 = λ52 = λ62 = λ83 = λ93 = 1, and {φ11 , φ12 , φ22 } in 8∗ are {1.0, 0.3, 1.0}. To obtain the influential cases, we replace the 250th observed variable as follows: for j = 3 and 9, y250,j = y250,j + 3Sj , where Sj is the standard deviation of yij for i = 1, . . . , 500. The prior distributions discussed in Section 3.3 are used with the following hyperparameters: the elements of Aj0 and 3j0 are taken as 0 and 0.8, respectively; 6Aj0 and 63j0 are taken as the identity matrices of proper dimensions; αj0 = 6,

31

32 33 34 35 36 37 38

8

M. Ouyang et al. / Computational Statistics and Data Analysis xx (xxxx) xxx–xxx

(a) Perturbation to the data.

(b) Perturbation to prior inputs and the sampling distribution.

Fig. 1. Local influence result in Simulation 1. Table 1 Number of correct detections among 100 replications in Simulation 1. Diagnosed component

Perturbed part

Perturbation scheme

No. of correct detections

y250 yi(−250) p(Aj ) p (3 j ) p(ψj )

Data Data Prior distribution Prior distribution Prior distribution

100 94 100 100 100

p(ψδ ) g1 (z1 ) g2 (z2 ) f1 (ξ1 ) f2 (ξ2 ) h12 (ξ1 , ξ2 )

Prior distribution Sampling distribution Sampling distribution Sampling distribution Sampling distribution Sampling distribution

ℓ(y |ϖ, θ, ωy (t )) ℓ(y |ϖ, θ, ωy (t )) ℓ(Aj |Aj0 , 6Aj0 , ωA (t )) ℓ(3j |3j0 , 63j0 , ωΛ (t )) ℓ(ψj−1 |αj0 , βj0 , ωψj (t )) ℓ(ψδ−1 |αδ0 , βδ0 , ωψδ (t )) ℓ(η|ξ, θ, ωs (t )) ℓ(η|ξ, θ, ωs (t )) ℓ(η|ξ, θ, ωs (t )) ℓ(η|ξ, θ, ωs (t )) ℓ(η|ξ, θ, ωs (t ))

100 100 98 100 100 100

Note: ωs (t ) = (ωs1 (t ), . . . , ωs5 (t ))T , ωs (0) = (0, 0, 0, 0, 0)T . 1 2 3

αδ0 = 9, βj0 = 1.8, βδ0 = 4; ρ0 = 7, R0 = 3I2 , where I2 is the two-dimensional identity matrix; ατγ d = ατβd = αbuv = 1 and βτγ d = βτβ = βbuv = 0.005. We utilize 16 equidistant knots to construct Bzk (·) and Bk (Φ (·)), and six equidistant knots d to construct Uus (Φ (·)) and Vv t (Φ (·)), leading to Kdz = Kj = 20 and S = T = 10 in (8). The second- and first-order random

23

walk priors are used to control the smoothness of the unknown univariate and bivariate smooth functions, respectively. We conduct several pilot runs to check the convergence of the MCMC algorithm and determine that it converges within 5000 observations. Thus, we collect additional 5000 observations after discarding the 5000 burn-in iterations to obtain the Bayesian estimates of model parameters and to calculate the first-order local influence measures for various perturbation schemes. We introduce perturbations to the data, prior distributions, and the sampling distribution in two stages. In the first stage, only the data are perturbed. Thus, we calculate a 500-dimensional vector of FI [vF ], where the 1st to 500th elements correspond to the 500 observations. The result of FI [vF ] is reported in the left panel of Fig. 1, which shows that the 250th observation is correctly detected as influential. In the second stage, we simultaneously perturb the prior inputs and the sampling distribution. The obtained results are reported in the right panel of Fig. 1, where the 1st to 4th elements correspond to the prior distributions of A, 3, 9, and ψδ , and the 5th to 9th elements correspond to g1 (zi1 ), g2 (zi2 ), f1 (ξi1 ), f2 (ξi2 ), and h12 (ξi1 , ξi2 ), respectively. The results show that the prior inputs are not sensitive. However, all the main effects of the covariates and the latent variables as well as the interaction effect of the latent variables are important and cannot be removed from the model. This observation agrees with the reality. We repeat the proceeding analysis based on 100 replicated data sets to summarize the empirical performance of the proposed Bayesian local influence procedure. The number of correctly diagnosed influential cases and the number of misdiagnosed cases are reported in Table 1. The following observations presented: (1) the outlier (y250 ) can be correctly identified each time. (2) Some other observations (denoted by yi(−250) ) are misdiagnosed as influential in 6 over 100 replications. (3) The effects of the prior inputs of A, 3, 9, and ψδ are not influential in each replication. (4) The main and interaction effects are correctly diagnosed as important consistently except that g2 (zi2 ) is misdiagnosed as unimportant in 2 over 100 replications. These findings indicate that the proposed method performs satisfactorily.

24

4.2. Simulation 2

4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22

25 26

In this section, we conduct another simulation to examine whether the Bayesian local influence analysis erroneously diagnoses unimportant effects as influential. We consider a semiparametric SEM, where the measurement equation is the

M. Ouyang et al. / Computational Statistics and Data Analysis xx (xxxx) xxx–xxx

(a) First loop of Simulation 2.

9

(b) Second loop of Simulation 2.

Fig. 2. Local influence result in Simulation 2.

Table 2 Number of correct detections among 100 replications in Simulation 2. Diagnosed component

Perturbed term

Perturbation scheme

No. of correct detections

g1 (z1 ) ̸= 0 g2 (z2 ) = 0 f1 (ξ1 ) ̸= 0 f2 (ξ2 ) = 0 h12 (ξ1 , ξ2 ) = 0

Sampling distribution Sampling distribution Sampling distribution Sampling distribution Sampling distribution

ℓ(η|ξ, θ, ωs (t )) ℓ(η|ξ, θ, ωs (t )) ℓ(η|ξ, θ, ωs (t )) ℓ(η|ξ, θ, ωs (t )) ℓ(η|ξ, θ, ωs (t ))

100 100 96 100 100

same as (22). However, the structural equation has the following simple form:

ηi = g1 (zi1 ) + f1 (ξi1 ) + δi .

1

(24)

We generate the data from a small model (24), and Bayesian local influence procedure is performed based on a large model (23). The main purpose is to check whether the perturbation to the sampling distribution can correctly differ the important effects, such as g1 (zi1 ) and f1 (ξi1 ), from the unimportant ones, such as g2 (zi2 ), f2 (ξi2 ), and h12 (ξi1 , ξi2 ). All of the settings are similar to those of Simulation 1, except that the outlier is not considered in Simulation 2. The obtained result is reported in the left panel of Fig. 2, where the 1st to 4th elements correspond to the prior inputs of A, 3, 9, and ψδ , and the 5th to 9th elements correspond to g1 (zi1 ), g2 (zi2 ), f1 (ξi1 ), f2 (ξi2 ), and h12 (ξi1 , ξi2 ), respectively. At the first loop, the 5th element of FI [vF ] is identified as the most influential, implying that the corresponding term, g1 (zi1 ), is the most important and should be kept in the model. At the second loop, we repeat the proceeding process, but g1 (zi1 ) is left unperturbed. The result is reported in the right panel of Fig. 2. The 6th (originally 7th) element of FI [vF ] is influential but the other elements are all close to zero. This observation implies that the corresponding f1 (ξi1 ) is also important but the other effects are unimportant and should be removed from the model. Thus, we conclude that only two terms, namely, g1 (zi1 ) and f1 (ξi1 ), should be kept in SEM. Again, this observation agrees with the true model setting. Table 2 summarizes the result of Bayesian local influence in diagnosing the model structure based on 100 replicated data sets. The important terms, namely, g1 (zi1 ) and f1 (ξi1 ), as well as unimportant terms, namely, g2 (zi2 ), f2 (ξi2 ), and h12 (ξi1 , ξi2 ), are correctly identified nearly every time. The only exception is that f1 (ξ1 ) is incorrectly identified as unimportant in 4 over 100 replications. This result demonstrates the capability of the proposed method in examining model structure by perturbing the sampling distribution. 5. Application In this section, we applied the proposed method to a study of osteoporosis prevention and control. A total of 1460 Chinese men aged 65 years and above were recruited. The observed variables include spine BMD, hip BMD, estrone (E1), estrone sulfate (E1-S), estradiol (E2), testosterone (TESTO), 5-androstenediol (5-DIOL), dihydrotestosterone (DHT), androstenedione (4-DIONE), dehydroepiandrosterone (DHEA), DHEA sulfate (DHEA-S), androsterone (ADT), ADT glucuronide (ADT-G), 3a-diol-3G (3G), and 3d-diol-17G (17G). As suggested by prior medical findings (Labrie et al., 2000; Belanger et al., 2002; Vandenput et al., 2007), {Spine BMD, Hip BMD}, {E1, E1-S, E2}, {TESTO, 5-DIOL, DHT}, {4-DIONE, DHEA, DHEA-S}, and {ADT, ADT-G, 3G, 3G-17G} were grouped into latent variables BMD (η), estrogen (ξ1 ), androgen (ξ2 ), precursors (ξ3 ), and metabolites (ξ4 ), respectively. The primary goal of this study is to investigate the functional relationships between BMD and its determinants, such as estrogen, androgen, precursors, and metabolites. We applied the proposed semiparametric SEM to conduct the analysis. In the measurement equation, p = 15, q = 5, q1 = 1, q2 = 4, ϖ = (η, ξ1 , ξ2 , ξ3 , ξ4 )T , and A = 0. Given that each observed variable is clearly associated with only one

2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19

20

21 22 23 24 25 26 27 28 29 30 31

10

M. Ouyang et al. / Computational Statistics and Data Analysis xx (xxxx) xxx–xxx

(a) First loop.

(b) Second loop.

(c) Third loop.

(d) Fourth loop. Fig. 3. Local influence result of BMD study data.

1

latent variable, the factor loading matrix 3 has a non-overlapping structure as follows:

 2

3 4

5

1 0  0 0 0

λ2,1 0 0 0 0

0 1 0 0 0

0

0

λ4,2

λ5,2

0 0 0

0 0 0

0 0 1 0 0

0 0

0 0

λ7,3

λ8,3

0 0

0 0

0 0 0 1 0

0 0 0

0 0 0

λ10,4

λ11,4

0

0

0 0 0 0 1

0 0 0 0

λ13,5

0 0 0 0

λ14,5

0 0 0 0

λ15,5

T    , 

where 1s and 0s are fixed to obtain an identified model and to provide a clear interpretation of the latent variables. We considered the following nonparametric structural equation to examine the potential determinants of BMD:

ηi = g1 (zi1 ) + g2 (zi2 ) + f1 (ξi1 ) + f2 (ξi2 ) + f3 (ξi3 ) + f4 (ξi4 ) +



huv (ξiu , ξiv ) + δi ,

(25)

u
where zi1 and zi2 indicate the weight and age of the respondents. Eq. (25) is a large model with the main effects of the covariates and the latent variables as well as the pairwise interaction effects of the latent variables. The proposed Bayesian local influence procedure was used to detect outliers or influential observations, check the sensitivity of Bayesian result to prior inputs, and examine the structure of (25). The aforementioned continuous measurements were standardized prior to analysis. In the posterior computation, the prior distributions discussed in Section 3.3 were specified for the unknown parameters. The hyperparameters were taken to those that were assigned in the simulation study. A total of 16 equidistant knots were used to construct Bzk (·)s and Bk (Φ (·))s, and six equidistant knots were used to construct Uus (Φ (·))s and Vv t (Φ (·))s in the approximation of the univariate and bivariate unknown functions. The MCMC algorithm converged within 30,000 iterations. After convergence, we collected additional 30,000 observations for the posterior inference. We first focused on the detection of outliers or influential observations by perturbing the data only and 1460 elements of FI [vF ] were calculated. The result is reported in Fig. 3, which comprises four subplots corresponding to four loops. At the first loop, the 126th and 224th observations were identified as the most suspicious. After deleting them, some parameter ˆ 5 (the estimate of ψ5 ) dropped by 57% after deleting the two suspicious estimates significantly changed. For example, ψ cases, implying that they are indeed influential. At the second loop, we deleted the two influential cases and perturbed the ˆ 5 further dropped by remaining data. The 10th observation was identified as suspicious. After deleting this observation, ψ 37%, implying that the suspicious case is influential. At the third loop, we deleted this influential case and perturbed the

11

–0.4

–0.6

–0.4

–0.3

–0.2

–0.2

0.0

–0.1

0.2

0.0

0.4

0.1

0.6

0.2

0.8

M. Ouyang et al. / Computational Statistics and Data Analysis xx (xxxx) xxx–xxx

–2

0

2

4

(a) Estimated function of latent variable 1.

6

–3

–2

–1

0

1

2

3

4

(b) Estimated function of latent variable 2.

Fig. 4. The estimated functional effects of estrogen and androgen before (dotted lines) and after (solid lines) deleting the outliers or influential observations.

(a) First loop.

(b) Second loop.

(c) Third loop. Fig. 5. Local influence result of the prior inputs and the sampling distribution in the BMD study.

remaining data. The 322nd (originally 325th) observation was identified as suspicious. After deleting it, the change rate of

ψˆ 5 was still over 24%. Thus, 325th observation was diagnosed as influential. The fourth loop detected two other suspicious ˆ 5 did not significantly change after cases: the 32nd (originally 33rd) and 198th (originally 200th) observations. However, ψ deleting them. Thus, they were not influential, and the procedure of detecting outliers or influential observations ended. We conducted estimation before and after deleting all of the influential cases. Fig. 4 indicates that the estimated functional effects of estrogen and androgen, fˆ1 (ξi1 ) and fˆ2 (ξi2 ), are apparently different before and after deleting them. Some parameter estimates likewise changed significantly. Thus, outliers or influential points indeed have impact on estimation results.

1 2 3 4 5 6 7

12

M. Ouyang et al. / Computational Statistics and Data Analysis xx (xxxx) xxx–xxx

(a) Covariate 1.

(b) Latent variable 1.

(c) Latent variable 2.

(d) Latent variable 3.

(e) Latent variable 4. Fig. 6. Estimated functional effects of BMD determinants.

1 2 3 4 5 6

7

In the next stage, we simultaneously perturbed the sampling distribution and prior inputs. Fig. 5 reports the results. At the first loop, g1 (zi1 ) (weight) was identified as the most important. We kept it and perturbed the other terms at the second loop. Then, f1 (ξi1 ) (estrogen) was identified as the second important. We kept g1 (zi1 ) and f1 (ξi1 ), and perturbed the remaining terms at the third loop. The result indicated that f2 (ξi2 ) (androgen), f4 (ξi4 ) (metabolites), and f3 (ξ3 ) (precursors) were also influential. However, g2 (zi2 ) (age), the pairwise interaction terms, and the prior inputs were not influential. We further compared the estimation results based on model (25) and the following simple model:

ηi = g1 (zi1 ) + f1 (ξi1 ) + f2 (ξi2 ) + f3 (ξi3 ) + f4 (ξi4 ).

(26)

M. Ouyang et al. / Computational Statistics and Data Analysis xx (xxxx) xxx–xxx

13

Fig. 6 presents the estimates of all terms in (26) based on the data excluding the outliers, which indicates that the diagnosed influential effects are indeed substantially different from zero. The estimation result of (25) (not reported) shows that those that are excluded in (26) are indeed close to zero. We obtained several insightful findings. First, weight has a significant positive effect on BMD, implying that keeping appropriate weight might have prevented BMD loss. This result agrees with the previous finding in the literature (Lau et al., 2006). Likewise, the effect of estrogen on BMD is significantly positive. High level of estrogen tends to result in high level of BMD or low risk of osteoporotic fractures. However, the effect of androgen varies from positive to negative across different androgen levels. At a low androgen level, the effect is positive. It becomes negative, and the negative effect rises rapidly with the increase of androgen level. The effect of metabolites on BMD is similar to that of androgen, whereas the effect of precursors exhibits a completely different pattern. At a low level of precursors, its effect on BMD is negative. This negative effect increases and becomes positive rapidly with the increase of precursors level. Then, it stabilizes nearby zero when the precursors attain a certain level. This analysis partially reaffirms the previous results (Labrie et al., 2000; Maffei et al., 2004; Yaturu et al., 2009; Song et al., 2010). It also provides additional insight into the dynamic patterns of the effects.

1 2 3 4 5 6 7 8 9 10 11 12 13

6. Discussion

14

In this study, we developed a Bayesian local influence procedure in the context of a semiparametric SEM. We introduced a Bayesian perturbation model by perturbing p(y |ϖ, θ), p(θ), and p(ϖ|θ) to characterize perturbations to the data, prior distributions, and the sampling distribution. We use the first- and second-order local influence measures with Bayes factor as the objective function to quantify the degree of various perturbations to the interested feature of the analysis. The empirical performance of the proposed method was demonstrated with simulation studies. An application to the BMD study revealed new insights into osteoporosis prevention and control. Several possible extensions can be considered. First, this study adopted the Bayes factor as an objective function in the computation of the first-order local influence measure. The use of other objective functions, such as φ -divergence, posterior mean distance, and the χ 2 -divergence, may be worthy of further investigation to achieve high local influence efficiency. Second, the proposed model accommodates only continuous observed variables. Given the popularity of various types of data in substantive research, extending the current model framework to incorporate mixed data types is highly important. Finally, we may consider a generalization of the proposed methodology to a longitudinal setting. This advance enables us to assess the dynamic effects of time-varying covariates and latent variables on the time-varying outcome of interest.

15 16 17 18 19 20 21 22 23 24 25 26 27

Uncited references

28

Q4

Song and Lu, 2012 and Zhu et al., 2007.

29

Acknowledgments This research was supported by NSFC 11471277 and 11671349 from the National Natural Science Foundation of China, GRF grants 14305014 and 14601115 from the Research Grant Council of HKSAR, and the Direct Grant of the Chinese University of Hong Kong. The authors thank the Editor, the Associate Editor, and the two reviewers for their valuable comments, which substantially improved the paper. Appendix A. Supplementary data Supplementary material related to this article can be found online at http://dx.doi.org/10.1016/j.csda.2017.01.007. References Belanger, C., Luu-The, V., Dupont, P., Tchernof, A., 2002. Adipose tissue intracrinology: potential importance of local androgen/estrogen metabolism in the regulation of adiposity. Horm. Metab. Res. 34, 737–745. Bollen, K.A., 1989. Structural Equations with Latent Variables. Wiley, New York. Chen, J., Liu, P.F., Song, X.Y., 2013. Bayesian diagnostics of transformation structural equation models. Comput. Statist. Data Anal. 68, 111–128. Cook, R.D., 1986. Assessment of local influence (with discussion). J. R. Stat. Soc. Ser. B Stat. Methodol. 48, 133–169. Forgy, R.W., 1965. Cluster analysis of multivariate data: efficiency versus interpretability of classifications. Biometrics 21, 768–769. Ibrahim, J.G., Zhu, H.T., Tang, N.S., 2011. Bayesian local influence for survival models. Lifetime Data Anal. 17, 43–70. Labrie, F., Luu-The, V., Lin, S.X., Simard, J., Labrie, C., El-Alfy, M., Pelletier, G., Belanger, A., 2000. Intracrinology: role of the family of 17 beta-hydroxysteroid dehydrogenases in human physiology and disease. J. Mol. Endocrinol. 25, 1–16. Lang, S., Brezger, A., 2004. Bayesian P-splines. J. Comput. Graph. Statist. 13, 183–212. Lau, E.M., Leung, P.C., Kwok, T., Woo, J., Lynn, H., Orwoll, E., Cummings, S., Cauley, J., 2006. The determinants of bone mineral density in Chinese menresults from Mr. Os (Hong Kong), the first cohort study on osteoporosis in Asian men. Osteoporos. Int. 17, 297–303. Lee, S.Y., 2007. Structural Equation Modeling: A Bayesian Approach. Wiley, Chichester,UK. Lee, S.Y., Tang, N.S., 2004. Local influence analysis of nonlinear structural equation models. Psychometrika 69, 573–592. Maffei, L., Murata, Y., Rochira, V., Tubert, G., Aranda, C., Vazquez, M., Clyne, C.D., Davis, S., Simpson, E.R., Carani, C., 2004. Dysmetabolic syndrome in a man with a novel mutation of the aromatase gene: effects of testosterone, alendronate, and estradiol treatment. J. Clin. Endocrinol. Metab. 89, 61–70. McCulloch, R.E., 1989. Local model influence. J. Amer. Statist. Assoc. 84, 473–478.

30

31

Q5

32 33 34

35

36

37

38 39 40 41 42 43 44 45 46 47 48 49 50

14

1 2 3 4 5 6 7 8 9 10 11 12

M. Ouyang et al. / Computational Statistics and Data Analysis xx (xxxx) xxx–xxx

Song, X.Y., Lee, S.Y., 2004. Local influence of two-level latent variable models with continuous and polytomous data. Statist. Sinica 14, 317–332. Song, X.Y., Lee, S.Y., 2012. Basic and Advanced Bayesian Structural Equation Modeling: With Applications in the Medical and Behavioral Sciences. John Wiley & Sons, London. Song, X.Y., Lu, Z.H., 2010. Semiparametric latent variable models with Bayesian P-splines. J. Comput. Graph. Statist. 19, 590–608. Song, X.Y., Lu, Z.H., 2012. Semiparametric transformation models with Bayesian P-splines. Stat. Comput. 22, 1085–1098. Song, X.Y., Lu, Z.H., Feng, X.N., 2014. Latent variable models with nonparametric interaction effects of latent variables. Stat. Med. 33, 1723–1737. Song, X.Y., Pan, J.H., Kwok, T., Vandenput, L., Ohlsson, C., Leung, P.C., 2010. A semiparametric Bayesian approach for structural equation models. Biom. J. 52, 314–332. Tang, N.S., Duan, X.D., 2014. Bayesian influence analysis of generalized partial linear mixed models for longitudinal data. J. Multivariate Anal. 126, 86–99. Vandenput, L., Labrie, F., Mellstrom, D., Swanson, C., Knutsson, T., Peeker, R., Ljunggren, O., Orwoll, E., Eriksson, A.L., Damber, J.E., Ohlsson, C., 2007. Serum levels of specific glucuronidated androgen metabolites predict BMD and prostate volume in elderly men. J. Bone Miner. Res. 22, 220–227. Yaturu, S., Humphrey, S., Landry, C., 2009. Decreased bone mineral density in men with metabolic syndrome alone and with type 2 diabetes. Med. Sci. Monitor 15 (1), CR5–CR9. Zhu, H.T., Ibrahim, J.G., Lee, S.Y., Zhang, H.P., 2007. Perturbation selection and influence measures in local influence analysis. Ann. Statist. 35, 2565–2588. Zhu, H.T., Ibrahim, J.G., Tang, N.S., 2011. Bayesian influence analysis: A geometric approach. Biometrika 98, 307–323. Zhu, H.T., Lee, S.Y., 2001. Local influence for incomplete data models. J. R. Stat. Soc. Ser. B Stat. Methodol. 63, 111–126.