Bayesian Bridge-Randomized Penalized Quantile Regression

Yuzhu Tian (School of Mathematics and Statistics, Henan University of Science and Technology, Luoyang), Xinyuan Song* (Department of Statistics, The Chinese University of Hong Kong, Hong Kong)

Computational Statistics and Data Analysis (2019), https://doi.org/10.1016/j.csda.2019.106876. Received 1 March 2019; revised 10 October 2019; accepted 12 October 2019.

* Corresponding author: Xinyuan Song, Department of Statistics, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong. Phone: (852) 3943 7929. Fax: (852) 2603 5188. E-mail: [email protected].

Abstract

Quantile regression (QR) is an appealing alternative for depicting the conditional quantile functions of a response variable when the assumptions of linear mean regression do not hold. One advantage of QR over traditional mean regression is that QR estimates are more robust against outliers and a large class of error distributions. Regularization methods have proven effective in the QR literature for conducting parameter estimation and variable selection simultaneously. This study considers a bridge-randomized penalty on the regression coefficients by incorporating penalty uncertainty into Bayesian bridge QR. The asymmetric Laplace distribution (ALD) and the generalized Gaussian distribution (GGD) are imposed as priors on the model errors and the regression coefficients, respectively, to establish a Bayesian bridge-randomized QR model. In addition, the bridge penalty exponent is treated as a parameter, on which a Beta-distributed prior is imposed. By utilizing the normal-exponential and uniform-Gamma mixture representations of the ALD and the GGD, a Bayesian hierarchical model is constructed to conduct fully Bayesian posterior inference. Gibbs sampler and Metropolis–Hastings algorithms are utilized to draw Markov chain Monte Carlo samples from the full conditional posterior distributions of all unknown parameters. Finally, the proposed procedures are illustrated by simulation studies and applied to a real-data analysis.

Keywords: Bridge-randomized penalty, Hierarchical model, MCMC methods, Quantile regression, Regularization method

1. Introduction

As a simple type of regression model, linear regression models have been used extensively in statistical analysis and in many applied fields, including but not limited to finance, economics, environmental science, and social science. Regression models are commonly estimated using the traditional least squares estimation (LSE) method, which describes the mean function of the response variable. However, for response observations with non-normal errors or outliers, the LSE may yield non-robust parameter estimates; the results are unsurprisingly sensitive to outliers and/or heavy-tailed errors, and estimation efficiency is naturally reduced in such cases. To solve this problem, quantile regression (QR; Koenker and Bassett, 1978) is used as an alternative. After rapid development in recent decades, QR has become an attractive statistical tool in modern regression analysis. QR describes the effects of covariates on the complete conditional distribution of a response variable instead of only its average value (see Koenker, 2005; Davino et al., 2014; Koenker et al., 2017 for recent reviews and further comprehensive studies on QR).

QR is traditionally solved using approaches such as the interior point algorithm (Koenker and Park, 1996), the majorize–minimization (MM) algorithm (Hunter and Lange, 2000), the Bayesian approach (Yu and Moyeed, 2001), and the expectation–maximization (EM) algorithm (Tian et al., 2014). Among them, the Bayesian QR approach has advantages over commonly used frequentist approaches and has been used extensively in many complex statistical models. Yu and Moyeed (2001) proposed to use the asymmetric Laplace distribution (ALD) as a working likelihood and developed the Bayesian QR method to conduct posterior inference. The ALD likelihood leads to simple computation and is easy to incorporate into the Bayesian framework for different types of data. As a result, Bayesian approaches provide a convenient alternative inference tool for QR, and many developments in Bayesian QR analysis are available. For example, Geraci and Bottai (2007) studied QR for longitudinal data; Reich et al. (2011) considered Bayesian spatial QR models; Kobayashi and Kozumi (2013) considered Bayesian QR analysis of censored dynamic panel data; Zhao and Lian (2015) considered Bayesian Tobit QR for single-index models; Huang (2016) studied Bayesian semiparametric quantile mixed effects models with complex data; and Tian et al. (2016) studied Bayesian joint QR for mixed effects models with multiple data features.

In regression analysis, many covariates may be brought into the model, and a parsimonious model that retains only significant covariates is highly desirable. Various variable selection procedures have been proposed to solve this problem, based either on information criteria, such as the Akaike information criterion (AIC) and the Bayesian information criterion (BIC), or on more recent penalized regularization methods. Many regularization methods, including the least absolute shrinkage and selection operator (LASSO; Tibshirani, 1996), adaptive LASSO (Zou, 2006), smoothly clipped absolute deviation (SCAD; Fan and Li, 2001), elastic net (Enet; Zou and Hastie, 2005), bridge penalized regression (Fu, 1998; Knight and Fu, 2000; Huang et al., 2008), and L1/2-norm penalization (Xu et al., 2010), can be used to conduct variable selection and model estimation simultaneously. These approaches impose a penalty on the size of the regression coefficients, which makes it possible to estimate parameters in the presence of a large number of variables and a relatively small number of observations.

With the rapid development of Bayesian statistical computation, many penalized regularization methods can be conveniently implemented in a Bayesian framework. For example, Park and Casella (2008) proposed Bayesian LASSO (BLASSO) regression; Alhamzawi et al. (2012) considered Bayesian adaptive LASSO (BALASSO) QR; Gefang (2014) discussed Bayesian doubly adaptive Enet LASSO for VAR shrinkage; Polson et al. (2014) studied Bayesian bridge regression; Betancourt et al. (2017) studied Bayesian fused LASSO regression for dynamic binary networks; Alhamzawi and Ali (2018) discussed Bayesian Tobit QR with the L1/2 penalty; Alhamzawi and Algamal (2018) studied Bayesian bridge QR with a fixed penalty exponent ξ = 1/2; and Mallick and Yi (2018) considered Bayesian bridge regression using L1/2-norm regularization. However, different penalized methods present different statistical properties and variable selection efficiency. Xu et al. (2010) showed that the Lξ (ξ ∈ (0, 1]) regularizer possesses many desirable statistical properties, such as oracle, sparsity, and unbiasedness.

In the present work, we are interested in Bayesian bridge-randomized QR based on the Lξ (ξ ∈ (0, 1]) regularizer. For bridge QR with Lξ-norm regularization, the penalty exponent ξ is treated as a parameter that can be estimated in the Bayesian framework. Specifically, a Beta-family prior is imposed on the penalty exponent ξ, and a Bayesian hierarchical QR model is established to conduct statistical inference.

The remainder of this paper is organized as follows. Section 2 presents the considered model and working likelihood. Section 3 establishes the Bayesian hierarchical model of bridge-randomized penalized QR and its adaptive version. Section 4 presents numerical studies that illustrate the proposed methods, and Section 5 applies the proposed methodology to a real-life study. Finally, Section 6 concludes the paper.

2. Model and working likelihood

We consider the following linear regression model:

\[ y_t = x_t^{T}\beta + \varepsilon_t, \quad t = 1, \ldots, n, \tag{1} \]

where \( x_t = (x_{t1}, \ldots, x_{tp})^{T} \) is a p-dimensional covariate vector, \( \beta = (\beta_1, \ldots, \beta_p)^{T} \) is the regression parameter vector, and ε_t is the error term. For any quantile level τ ∈ (0, 1), the τth conditional quantile of the response y_t is defined as \( \mathrm{Quantile}_{\tau}(y_t \,|\, x_t) = x_t^{T}\beta \). According to Koenker and Bassett (1978), the QR estimator can be obtained by minimizing the following loss objective function:

\[ Q_0(\beta) = \sum_{t=1}^{n} \rho_\tau(y_t - x_t^{T}\beta), \tag{2} \]

where \( \rho_\tau(u) = u(\tau - I(u < 0)) \) is the quantile check function.

In QR modeling, to select significant covariates and thereby improve prediction accuracy, we can adopt various penalty functions p(·), such as the L1 penalty (LASSO and adaptive LASSO), the L2 penalty (ridge regression), the Lξ (0 < ξ < 1) penalty (bridge regression), Enet, and SCAD, which result in the penalized QR loss objective function

\[ Q(\beta) = Q_0(\beta) + p_\lambda(|\beta|), \tag{3} \]

where λ > 0 is a penalty parameter and \( p_\lambda(|\beta|) \) is a penalty function on β.
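For concreteness, the check loss and the penalized objective in (3) can be evaluated with a few lines of R. The following is a minimal sketch of our own (function and variable names are illustrative, not from the authors' code), shown here for the bridge penalty discussed next:

rho_tau <- function(u, tau) u * (tau - (u < 0))  # quantile check function

# Bridge-penalized QR objective: sum of check losses + lambda * sum |beta_i|^xi
penalized_qr_loss <- function(beta, y, X, tau, lambda, xi) {
  resid <- y - X %*% beta
  sum(rho_tau(resid, tau)) + lambda * sum(abs(beta)^xi)
}

# Small usage example with simulated data
set.seed(1)
X <- matrix(rnorm(100 * 3), 100, 3)
y <- X %*% c(0.85, 0, 0.85) + rt(100, df = 3)
penalized_qr_loss(beta = c(1, 0, 1), y = y, X = X, tau = 0.5, lambda = 1, xi = 0.5)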

Among the above penalized regressions, bridge penalized regression is an important regularization method that utilizes the Lξ-norm penalty. Generally, Lξ-norm penalized QR can be expressed as

\[ \min_{\beta}\; \sum_{t=1}^{n} \rho_\tau(y_t - x_t^{T}\beta) + \lambda \sum_{i=1}^{p} |\beta_i|^{\xi}, \tag{4} \]

where λ > 0 is the tuning parameter and ξ is the penalty exponent of the Lξ-norm penalization function.

For bridge penalized regression (4), the penalization function \( p_\lambda(|\beta|) = \lambda \sum_{i=1}^{p} |\beta_i|^{\xi} \) (ξ ≥ 0) includes many popular penalties as special cases: best subset selection if ξ = 0, LASSO regression if ξ = 1, ridge regression if ξ = 2, bridge regression if 0 < ξ < 1, and the L1/2 regularizer if ξ = 1/2. Among the Lξ-norm regularizers, the bridge regularizer has many desirable statistical properties. Huang et al. (2008) studied the asymptotic properties of bridge regression. Xu et al. (2010) showed that bridge regression possesses the oracle, sparsity, and unbiasedness properties; the authors also revealed that the L1/2 penalty is the most sparse and robust among the Lξ (1/2 ≤ ξ ≤ 1) penalties and has properties similar to those of the Lξ (0 < ξ < 1/2) regularizers. In a Bayesian framework, Polson et al. (2014) proposed a Bayesian bridge estimator for regularized regression; Alhamzawi and Algamal (2018) studied Bayesian bridge QR with the penalty exponent fixed at ξ = 1/2; and Mallick and Yi (2018) considered Bayesian bridge regression and provided sufficient conditions for strong posterior consistency under a sparsity assumption on a high-dimensional parameter.

In the aforementioned studies, ξ is generally regarded as a fixed constant, and the Lξ regularizers with 0 < ξ ≤ 1 outperform those with 1 < ξ ≤ 2 in simulations; that is, Lξ regularization with a smaller value of ξ yields a better solution than that with a larger value. In the literature, the best value of the penalty exponent ξ in Lξ-norm regularization lies in the interval (0, 1]. Naturally, ξ can be treated as an unknown parameter and selected using a data-driven method to obtain the optimal penalty. However, no study has focused on selecting the optimal ξ for bridge penalized regression. In a frequentist framework, selecting an optimal ξ by minimizing the complex objective loss function is difficult; Bayesian methods provide an available alternative, in which ξ is treated as a random variable and estimated using an MCMC algorithm. In this study, we aim to conduct Bayesian statistical inference with an uncertain penalty exponent ξ. We refer to Bayesian bridge QR with unknown ξ as Bayesian bridge-randomized (BRBridge) QR, which places bridge-randomized regularization priors on the regression parameters. We consider the bridge QR estimator (4) and its adaptive version

\[ \min_{\beta}\; \sum_{t=1}^{n} \rho_\tau(y_t - x_t^{T}\beta) + \sum_{i=1}^{p} \lambda_i |\beta_i|^{\xi}, \tag{5} \]

where ξ is a penalty exponent valued in the interval (0, 1], which represents a concave bridge penalty.

In a Bayesian QR framework, we need to specify a working likelihood for the model error. According to Yu and Moyeed (2001), maximizing the likelihood under ALD errors is equivalent to minimizing the objective loss function (2) of QR. The ALD has the probability density function (pdf)

\[ f(y \,|\, \mu, \sigma, \tau) = \frac{\tau(1-\tau)}{\sigma} \exp\left\{ -\rho_\tau\!\left( \frac{y - \mu}{\sigma} \right) \right\}, \tag{6} \]

where μ is the location parameter, σ is the scale parameter, and 0 < τ < 1 is the skewness parameter. The conditional likelihood of QR model (1) can then be equivalently defined as

\[ H(\beta, \sigma \,|\, (x_1^{T}, y_1), \ldots, (x_n^{T}, y_n)) = \prod_{t=1}^{n} \frac{\tau(1-\tau)}{\sigma} \exp\left\{ -\rho_\tau\!\left[ \frac{y_t - x_t^{T}\beta}{\sigma} \right] \right\}. \tag{7} \]

Direct use of the conditional likelihood function (7) entails computational difficulty because of the inherent non-differentiability of the QR check function. Nevertheless, Kozumi and Kobayashi (2011) provided a scale mixture of Gaussians representation of model (1):

\[ y_t = x_t^{T}\beta + \theta_1 \upsilon_t + \sqrt{\theta_2 \sigma \upsilon_t}\, e_t, \quad t = 1, 2, \ldots, n, \tag{8} \]

where \( \theta_1 = \frac{1-2\tau}{\tau(1-\tau)} \), \( \theta_2 = \frac{2}{\tau(1-\tau)} \), \( \upsilon_t \sim \mathrm{Exp}(\frac{1}{\sigma}) \), \( e_t \sim N(0,1) \), and υ_t and e_t are independent of one another. Hence, the resulting conditional distribution of y_t is normal with mean \( x_t^{T}\beta + \theta_1 \upsilon_t \) and variance \( \theta_2 \sigma \upsilon_t \).

n [ ∏

Jo

L(β, σ|y, x, v) =

t=1

{ (y − xT β − θ υ )2 } 1 { 1 }] 1 t 1 t t √ √ exp − exp − υt . 2θ2 συt σ σ 2π θ2 συt

(9)

122

We need to specify bridge penalization prior for regression coefficient to conduct the fully

123

Bayesian analysis of bridge regularized models (4) and (5). The implementation of BRBridge and

124

its adaptive version (BARBridge) will be presented in the following section. 5

Journal Pre-proof

125
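As an illustration of representation (8), the following minimal R sketch (our own construction; names are illustrative) simulates ALD(0, σ, τ) errors through the normal–exponential mixture. Since the τth quantile of this ALD is zero, roughly a fraction τ of the simulated draws should fall below zero:

# Simulate ALD(mu = 0, sigma, tau) via the normal-exponential mixture (8):
# y = theta1 * v + sqrt(theta2 * sigma * v) * e, with v ~ Exp(1/sigma), e ~ N(0, 1)
rald_mix <- function(n, sigma, tau) {
  theta1 <- (1 - 2 * tau) / (tau * (1 - tau))
  theta2 <- 2 / (tau * (1 - tau))
  v <- rexp(n, rate = 1 / sigma)   # latent exponential scales
  e <- rnorm(n)                    # standard normal shocks
  theta1 * v + sqrt(theta2 * sigma * v) * e
}

# Sanity check: about tau of the mass should lie below 0
set.seed(2)
mean(rald_mix(1e5, sigma = 1, tau = 0.25) < 0)  # close to 0.25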

3. BRBridge and its adaptive version


3.1. Priors

To analyze the BRBridge QR model (4), we impose generalized Gaussian distribution (GGD; Nadarajah, 2006) priors on the regression coefficients:

\[ \pi(\beta \,|\, \sigma, \lambda, \xi) = \prod_{i=1}^{p} \pi(\beta_i \,|\, \sigma, \lambda, \xi), \qquad \pi(\beta_i \,|\, \sigma, \lambda, \xi) = \frac{(\lambda/\sigma)^{1/\xi}}{2\Gamma(1 + 1/\xi)} \exp\left\{ -\frac{\lambda}{\sigma} |\beta_i|^{\xi} \right\}, \tag{10} \]

where λ > 0 and ξ ∈ (0, 1]. By combining the GGD priors with the conditional likelihood function (7), we obtain the marginal posterior density of β:

\[ \pi(\beta \,|\, y, x) \propto L(\beta, \sigma \,|\, y, x, \upsilon)\, \pi(\beta \,|\, \sigma, \lambda, \xi) \propto \prod_{t=1}^{n} \exp\left\{ -\frac{(y_t - x_t^{T}\beta - \theta_1 \upsilon_t)^2}{2\theta_2 \sigma \upsilon_t} \right\} \exp\left\{ -\frac{\lambda}{\sigma} \sum_{i=1}^{p} |\beta_i|^{\xi} \right\}. \tag{11} \]

We observe that the estimator of β obtained in (4) amounts to the posterior mode of the marginal density (11). However, the posterior (11) is analytically intractable because the bridge penalty makes it difficult to handle directly. Polson et al. (2014) presented a mixture representation of the GGD based on a two-component mixture of Gamma distributions. Mallick and Yi (2018) studied Bayesian bridge regression and proposed a new mixture representation of the GGD, in which the mixing distribution is a particular Gamma distribution; this representation is simpler and more efficient than that of Polson et al. (2014). The mixture representation of the GGD proposed by Mallick and Yi (2018) has the following form:

\[ \frac{\lambda^{1/\xi}}{2\Gamma(1 + 1/\xi)} \exp\left\{ -\lambda |x|^{\xi} \right\} = \int_{|x|^{\xi}}^{\infty} \frac{1}{2u^{1/\xi}} \cdot \frac{\lambda^{1 + 1/\xi} u^{1/\xi}}{\Gamma(1 + 1/\xi)} \exp\{-\lambda u\}\, du. \tag{12} \]

By using the above mixture representation, we decompose the prior of β_i in (10) as

\[ \pi(\beta_i \,|\, \sigma, \lambda, \xi) = \int_{0}^{\infty} \pi(\beta_i \,|\, \lambda, \sigma, s_i, \xi)\, \pi(s_i \,|\, \xi)\, ds_i, \tag{13} \]

where \( \pi(\beta_i \,|\, \lambda, \sigma, s_i, \xi) = \mathrm{Uniform}[-s_i^{1/\xi}, s_i^{1/\xi}] \) and \( \pi(s_i \,|\, \xi) = \mathrm{Gamma}\!\left(1 + \frac{1}{\xi}, \frac{\lambda}{\sigma}\right) \), i = 1, ..., p. The priors of the other parameters are set as follows:

\[ \pi(\xi) \sim \mathrm{Beta}(e, f), \quad \pi(\lambda) \sim \mathrm{Gamma}(a, b), \quad \pi(\sigma) \sim \mathrm{IGamma}(c, d), \tag{14} \]

where IGamma(·, ·) denotes the inverse Gamma distribution.
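The uniform–Gamma decomposition (12)–(13) can be checked numerically. The R sketch below is our own, with the rate λ/σ collapsed into a single illustrative argument rate; draws generated hierarchically should match the GGD density in (10):

# Hierarchical draw from the GGD prior via (13):
# s_i ~ Gamma(1 + 1/xi, rate), then beta_i | s_i ~ Uniform(-s_i^(1/xi), s_i^(1/xi))
rggd_mix <- function(n, xi, rate) {
  s <- rgamma(n, shape = 1 + 1 / xi, rate = rate)
  runif(n, min = -s^(1 / xi), max = s^(1 / xi))
}

# Compare the sample with the GGD density in (10), with lambda/sigma = rate
set.seed(3)
xi <- 0.5; rate <- 1
beta_draws <- rggd_mix(1e5, xi, rate)
dggd <- function(b) rate^(1 / xi) / (2 * gamma(1 + 1 / xi)) * exp(-rate * abs(b)^xi)
hist(beta_draws, breaks = 200, freq = FALSE, xlim = c(-20, 20))
curve(dggd(x), add = TRUE, col = "red")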

146

3.2. Posterior inference On the basis of the hierarchical likelihood (9) and the preceding priors, we formulate the hierarchical model as follows: y n×1 |X, β, σ ∼ L(β, σ|y, x, v), β|S, λ, σ, ξ ∼

i=1

1/ξ

1/ξ

Uniform[−si , si ], S = {si , i = 1, · · · , p},

( 1 λ) Gamma 1 + , , ξ σ i=1 p ∏

of

S|λ, σ, ξ ∼

p ∏

p ro

λ ∼ Gamma(a, b),

(15)

σ ∼ IGamma(c, d), ξ ∼ Beta(e, f ).

Let θ = {β, S, λ, σ, ξ} be a vector of unknown parameter. The Bayesian estimate of θ can be obtained through the mean or mode of the posterior samples drawn from p(θ|y, x, v). However,

Pr e-

p(θ|y, x, v) is indirectly tractable because v is also random. Thus, we use the data augmentation and Gibbs sampler to iteratively sample from p(θ|y, x, v) and p(v|y, x, θ). Given that θ includes multiple components, p(θ|y, x, v) is still complicated. The Gibbs sampler is used further to simulate each unknown of θ from its full conditional distribution. Based on the hierarchical model and prior specifications, the full conditional distributions can be obtained as follows: π(σ|y, x, υ, β, λ, ξ) ∼ IGamma(δ, η),

al

p ∑ ( ) π(λ|y, x, υ, β, σ, ξ) ∼ Gamma p(1 + 1/ξ) + a, b + si /σ .

urn

( 1 η 2 θ2 + 2θ ) 2 π(υt |y, x, β, σ, λ, ξ) ∼ GIG , t , 1 , 2 θ2 σ θ2 σ p ∏ 1/ξ ∗ ∗ π(β|y, x, υ, σ, λ) ∼ N (β , B ) I(|βi | ≤ si ), π(S|y, X, υ, β, σ, λ) ∼

p ∏

i=1

149

150

3n 2

Exp(λ/σ)I{si ≥ |βi |ξ },

+ p(1 + 1/ξ) + c, η =

Jo

148

where δ = ηt = y t −

xTt β,

(16)

i=1

π(ξ|β, σ, λ, y, x) ∝ ξ e−1 (1 − ξ)f −1 147

i=1

∑n

t=1

(

p p ∏ (λ/σ) ξ I{0 ≤ ξ ≤ 1} I{|βi |ξ ≤ si }, [Γ(1 + 1/ξ)]p i=1

e2t 2θ2 υt

) ∑p + υt + λ i=1 si + d, et = yt − xTt β − θ1 υt ;

GIG(λ, χ, ψ) denotes the generalized inverse Gaussian distribution with index λ (∑ ) (∑ )−1 xt xT n n xt y˜t ∗ t and scale parameters χ > 0 and ψ > 0; β ∗ = B ∗ , B = , and t=1 θ2 συt t=1 θ2 συt y˜t = yt − θ1 υt .

7

Journal Pre-proof

151

In the full conditional distributions of (16), π(σ|·), π(λ|·), and π(vt |·) are familiar distributions,

152

and sampling from them is straightforward and fast. On the contrary, π(β|·) is a multivariate

153

truncated normal distribution. Sampling from π(β|·) can be implemented by iteratively sampling

154

from multiple conditional univariate truncated normal distributions (Devroye, 1986). π(si |·) is a

155

left-truncated exponential distribution, sampling from which is accomplished by using the inverse

156

transformation method with two substeps: (a) generate s∗i ∼ Exp(λ/σ) and (b) let si = s∗i +

|βi |ξ , i = 1, · · · , p. π(ξ|·) is a nonstandard and complex distribution. A random-walk Metropolis

158

algorithm (Polson et al., 2014) is employed to sample from π(ξ|·).

159
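The two-substep update for s_i and the random-walk Metropolis step for ξ translate directly into R. The following is an illustrative sketch of ours (not the authors' code), with log_post_xi collecting the terms of π(ξ|·) in (16) up to an additive constant:

# (a)-(b): draw each s_i from the left-truncated exponential in (16)
draw_s <- function(beta, xi, lambda, sigma) {
  s_star <- rexp(length(beta), rate = lambda / sigma)
  s_star + abs(beta)^xi
}

# Log full conditional of xi in (16), up to an additive constant
log_post_xi <- function(xi, beta, s, lambda, sigma, e = 1, f = 1) {
  if (xi <= 0 || xi > 1 || any(abs(beta)^xi > s)) return(-Inf)
  p <- length(beta)
  (e - 1) * log(xi) + (f - 1) * log(1 - xi) +
    (p / xi) * log(lambda / sigma) - p * lgamma(1 + 1 / xi)
}

# One random-walk Metropolis step for xi (assumes the current xi is valid)
mh_step_xi <- function(xi, beta, s, lambda, sigma, step = 0.05) {
  xi_prop <- xi + rnorm(1, 0, step)
  log_ratio <- log_post_xi(xi_prop, beta, s, lambda, sigma) -
               log_post_xi(xi, beta, s, lambda, sigma)
  if (log(runif(1)) < log_ratio) xi_prop else xi
}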

3.3. Adaptive version of BRBridge QR

In this section, we consider BARBridge QR, an adaptive version of BRBridge QR. Unlike BRBridge QR, which imposes a single penalty parameter on all regression coefficients, BARBridge introduces a different penalty for each regression coefficient. Specifically, the prior distributions of the regression coefficients are defined as follows:

\[ \pi(\beta \,|\, \sigma, \Lambda, \xi) = \prod_{i=1}^{p} \pi(\beta_i \,|\, \sigma, \lambda_i, \xi), \qquad \pi(\beta_i \,|\, \sigma, \lambda_i, \xi) = \frac{(\lambda_i/\sigma)^{1/\xi}}{2\Gamma(1 + 1/\xi)} \exp\left\{ -\frac{\lambda_i}{\sigma} |\beta_i|^{\xi} \right\}, \tag{17} \]

where λ_i > 0 is the penalty parameter imposed on β_i, Λ = (λ_1, ..., λ_p), and ξ ∈ (0, 1]. Furthermore, the prior distributions of S and Λ are assigned as follows:

\[ S \,|\, \Lambda, \sigma, \xi \sim \prod_{i=1}^{p} \mathrm{Gamma}\!\left(1 + \frac{1}{\xi}, \frac{\lambda_i}{\sigma}\right), \qquad \Lambda \sim \prod_{i=1}^{p} \mathrm{Gamma}(a_i, b_i), \tag{18} \]

where S, σ, and ξ are the same as those in (15), and a_i and b_i are hyperparameters. The full conditional distributions can be derived in a similar manner as in Section 3.2, and the implementation of the Gibbs sampler and MCMC algorithm is likewise similar; the details are omitted.

Compared with BRBridge QR, which penalizes all regression coefficients equally, BARBridge allows data-driven penalties, which automatically impose heavy penalties on unimportant coefficients and light penalties on important ones, thereby enabling highly flexible and efficient variable selection and parameter estimation, especially in sparse cases.

4. Numerical studies


4.1. Simulation 1


175

In this section, we conduct simulation studies to assess the finite sample performance of the

176

proposed Bayesian estimation procedures. We generate 100 datasets from Equation (1) in the

177

following cases with sample size of n = 100 and 200 as follows: 8

Journal Pre-proof

10000

15000

20000

0

0

5000

10000

15000

20000

10000

0.0 1.0 2.0

0.0 1.0 2.0

5000

15000

20000

1.0 0.0

1.0

5000

10000

20000

5000

10000

15000

20000

5000

10000

15000

20000

15000

20000

xi

0.0 0

15000

beta_6

0

sigma

10000

beta_4

0

beta_5

0

5000

0.0 1.0 2.0

0.0 1.0 2.0

beta_3

of

5000

p ro

0

beta_2

0.0 1.0 2.0

0.0 1.0 2.0

beta_1

15000

20000

0

5000

10000

Pr e-

Figure 1: MCMC chains starting from different initial values under Model 1: n = 100, εt ∼ t3 , and τ = 0.5.

Model 1 (Dense case): β = (0.85, 0.85, 0.85, 0.85, 0.85, 0.85)^T;

Model 2 (Sparse case): β = (0.85, 0, 0.85, 0, 0, 0)^T.

The predictors x_t = (x_{t1}, ..., x_{tp})^T, t = 1, ..., n, are generated independently from a standard normal distribution. Both the standard normal distribution (N(0, 1)) and the heavy-tailed t distribution with 3 degrees of freedom (t3) are considered for the model errors. Three quantile levels, 0.25, 0.50, and 0.75, are considered in all simulations. The hyperparameters of the Gamma, inverse Gamma, and Beta priors discussed in Section 3 are set as follows: (Prior 1) a = b = c = d = 0.1 and e = f = 1.
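Under these settings, one simulated dataset can be generated as follows (a minimal R sketch of ours; names are illustrative):

# Generate one dataset from model (1) under the simulation settings
gen_data <- function(n, beta, error = c("normal", "t3")) {
  p <- length(beta)
  X <- matrix(rnorm(n * p), n, p)  # independent N(0,1) predictors
  eps <- switch(match.arg(error), normal = rnorm(n), t3 = rt(n, df = 3))
  list(y = as.vector(X %*% beta + eps), X = X)
}

set.seed(4)
dat1 <- gen_data(100, rep(0.85, 6), "t3")               # Model 1 (dense)
dat2 <- gen_data(100, c(0.85, 0, 0.85, 0, 0, 0), "t3")  # Model 2 (sparse)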

To assess MCMC convergence, we first conduct a few test runs using different initial values in the setting of Model 1: n = 100, εt ∼ t3, and τ = 0.5. We consider three different sets of initial values, namely, initial 1: {β = (−3, −3, −3, −3, −3, −3)^T, σ = 1, ξ = 1/3}; initial 2: {β = (1, 1, 1, 1, 1, 1)^T, σ = 5, ξ = 1/2}; and initial 3: {β = (3, 3, 3, 3, 3, 3)^T, σ = 10, ξ = 2/3}. Figure 1 presents the three MCMC chains starting from these initial values. The rapid mixing of the chains indicates quick convergence of the MCMC algorithm. Figures 2 and 3 present the trace and density plots of 20,000 posterior samples of one MCMC chain for the regression and penalty parameters, which reconfirm that the MCMC chains of all parameters rapidly converge to their stationary distributions. We also depict the autocorrelation function (ACF) plot to check the autocorrelation between posterior samples in this setting.


1.4

Figure 2: Trace plot of 20,000 Gibbs samples and density plots of 1000 posterior samples of regression coefficients under Model 1: n = 100, εt ∼ t3 , and τ = 0.5.


Index

Figure 3: Trace plot of 20,000 Gibbs samples and density plots of 1000 posterior samples of penalty index ξ under Model 1: n = 100, εt ∼ t3 , and τ = 0.25, 0.50, and 0.75.


Figure 4: ACF plot under Model 1: n = 100, εt ∼ t3, and τ = 0.5. The horizontal axis denotes the number of MCMC iterations.

Figure 4 shows that the autocorrelation between posterior samples rapidly declines to zero. Hence, in the subsequent analyses we use all samples after burn-in instead of taking every qth (q > 1) iterated sample of a long chain. To be conservative, for each simulation we run the Gibbs sampling algorithm for 20,000 iterations, discard the first 5,000 iterations as burn-in, and perform Bayesian inference using the remaining 15,000 posterior samples. In the simulations, the MH acceptance rate for the penalty exponent ξ is approximately 15%–30% in all cases.

On the basis of 100 replications, the averaged estimation biases (Bias), root mean square errors (RMSE), and 95% equal-tailed confidence lower limits (CL) and upper limits (CU) of the regression coefficients obtained by BRBridge and BARBridge QR over the three quantile levels are presented in Tables 1–2 (Model 1) and 3–4 (Model 2). Meanwhile, the averaged estimates, standard error estimates, and 95% equal-tailed credible intervals of ξ are presented in the last columns of these tables (written in bold). Tables 1 and 2 show that the BRBridge and BARBridge QR procedures estimate parameters accurately for both error distributions under sample sizes n = 100 and n = 200. BRBridge and BARBridge QR yield better estimation results, with smaller biases and RMSEs for most parameters, under the N(0, 1) error than under the heavy-tailed t3 error over the three quantiles. The proposed procedures provide nearly unbiased estimates of all regression coefficients, and as the sample size n increases, the estimation results improve with smaller RMSEs.

Journal Pre-proof

Table 1: BRBridge QR results of Model 1.

Error

τ

Parameter

β1 = 0.85

β2 = 0.85

β3 = 0.85

β4 = 0.85

β5 = 0.85

β6 = 0.85

ξ

n = 100 −0.028 0.146 0.529 1.091

0.010 0.126 0.616 1.096

−0.017 0.141 0.563 1.097

−0.036 0.120 0.622 1.028

−0.028 0.137 0.595 1.093

−0.016 0.140 0.565 1.102

0.802 0.004 0.795 0.811

0.50

Bias RMSE 95% CL 95% CU

−0.007 0.116 0.647 1.071

−0.029 0.117 0.613 1.033

−0.015 0.111 0.620 1.042

−0.014 0.121 0.597 1.050

−0.021 0.111 0.637 1.020

−0.002 0.104 0.637 1.027

0.803 0.005 0.794 0.811

0.75

Bias RMSE 95% CL 95% CU

0.003 0.133 0.617 1.052

0.006 0.131 0.594 1.103

0.006 0.128 0.642 1.100

−0.014 0.132 0.584 1.093

−0.015 0.110 0.581 1.023

−0.046 0.147 0.544 1.035

0.803 0.005 0.794 0.811

0.25

Bias RMSE 95% CL 95% CU

−0.017 0.162 0.515 1.114

−0.035 0.158 0.485 1.139

−0.036 0.174 0.506 1.126

−0.032 0.155 0.548 1.090

−0.005 0.175 0.517 1.153

−0.030 0.158 0.528 1.074

0.802 0.006 0.789 0.812

0.50

Bias RMSE 95% CL 95% CU

−0.027 0.120 0.614 1.048

−0.028 0.129 0.584 1.039

−0.037 0.139 0.538 1.084

−0.017 0.122 0.637 1.076

−0.033 0.129 0.604 1.069

−0.031 0.134 0.544 1.069

0.803 0.005 0.791 0.812

0.75

Bias RMSE 95% CL 95% CU

0.004 0.157 0.563 1.126

−0.019 0.190 0.472 1.149

−0.038 0.171 0.506 1.103

−0.014 0.153 0.521 1.150

−0.054 0.165 0.516 1.117

0.027 0.149 0.551 1.056

0.802 0.006 0.789 0.811

of

Bias RMSE 95% CL 95% CU

p ro

t3

0.25

0.25

Bias RMSE 95% CL 95% CU

−0.013 0.086 0.699 1.029

Pr e-

N (0, 1)

0.006 0.098 0.664 1.055

−0.014 0.095 0.649 1.031

−0.010 0.089 0.664 1.027

−0.016 0.097 0.672 0.991

−0.011 0.094 0.674 1.016

0.804 0.004 0.796 0.811

0.50

Bias RMSE 95% CL 95% CU

−0.005 0.079 0.732 1.025

−0.028 0.088 0.640 0.969

−0.004 0.086 0.697 1.009

−0.022 0.079 0.674 0.963

−0.015 0.091 0.665 1.002

0.000 0.083 0.688 1.009

0.805 0.005 0.795 0.814

0.75

Bias RMSE 95% CL 95% CU

0.010 0.094 0.700 1.038

−0.018 0.094 0.676 1.008

−0.005 0.091 0.661 1.023

−0.008 0.084 0.667 0.997

0.002 0.081 0.703 1.003

0.000 0.101 0.664 1.042

0.805 0.005 0.795 0.814

Bias RMSE 95% CL 95% CU

−0.017 0.124 0.585 1.041

−0.031 0.124 0.585 1.043

−0.008 0.109 0.645 1.071

−0.008 0.115 0.626 1.056

−0.018 0.117 0.630 1.051

−0.025 0.103 0.618 0.979

0.805 0.005 0.795 0.813

Bias RMSE 95% CL 95% CU

−0.015 0.089 0.678 0.985

−0.020 0.101 0.650 1.003

−0.011 0.089 0.669 0.994

−0.018 0.099 0.663 1.055

−0.008 0.082 0.667 0.995

−0.008 0.097 0.649 1.030

0.805 0.005 0.797 0.813

Bias RMSE 95% CL 95% CU

−0.014 0.110 0.659 1.055

−0.015 0.116 0.614 1.030

−0.025 0.111 0.598 1.008

−0.011 0.113 0.619 1.040

0.011 0.126 0.588 1.095

−0.020 0.111 0.625 1.033

0.804 0.005 0.795 0.813

t3

0.25

0.50

Jo

0.75

urn

N (0, 1)

al

n = 200

Note: The last column presents the averaged Bayesian estimates, standard error estimates, and the 95% equal-tailed credible intervals of ξ on the basis of 100 replications.

12

Journal Pre-proof

Table 2: BARBridge QR results of Model 1

Error

τ

Parameter

β1 = 0.85

β2 = 0.85

β3 = 0.85

β4 = 0.85

β5 = 0.85

β6 = 0.85

ξ

n = 100 −0.021 0.149 0.560 1.122

−0.009 0.144 0.582 1.091

0.000 0.142 0.581 1.147

−0.004 0.142 0.567 1.120

−0.015 0.123 0.609 1.040

−0.026 0.137 0.542 1.069

0.866 0.004 0.859 0.872

0.50

Bias RMSE 95% CL 95% CU

−0.006 0.111 0.652 1.048

−0.013 0.134 0.571 1.085

−0.009 0.118 0.620 1.061

−0.019 0.130 0.624 1.087

−0.028 0.116 0.604 1.003

−0.021 0.131 0.580 1.096

0.868 0.004 0.861 0.874

0.75

Bias RMSE 95% CL 95% CU

−0.015 0.129 0.555 1.061

−0.007 0.132 0.603 1.105

−0.024 0.132 0.580 1.073

−0.026 0.130 0.556 1.020

0.004 0.121 0.636 1.080

−0.015 0.120 0.623 1.091

0.867 0.004 0.860 0.873

0.25

Bias RMSE 95% CL 95% CU

−0.020 0.174 0.486 1.154

−0.033 0.176 0.515 1.144

−0.023 0.159 0.500 1.116

−0.025 0.153 0.545 1.092

−0.025 0.167 0.499 1.113

0.012 0.189 0.472 1.183

0.868 0.004 0.861 0.875

0.50

Bias RMSE 95% CL 95% CU

−0.023 0.121 0.595 1.053

−0.036 0.150 0.552 1.153

−0.022 0.144 0.531 1.143

−0.005 0.137 0.600 1.074

−0.028 0.136 0.552 1.048

−0.007 0.120 0.606 1.071

0.870 0.003 0.865 0.877

0.75

Bias RMSE 95% CL 95% CU

−0.021 0.184 0.466 1.165

−0.020 0.187 0.470 1.165

−0.055 0.177 0.532 1.117

−0.001 0.160 0.547 1.124

−0.027 0.178 0.488 1.168

−0.071 0.187 0.435 1.097

0.868 0.004 0.860 0.876

of

Bias RMSE 95% CL 95% CU

p ro

t3

0.25

0.25

Bias RMSE 95% CL 95% CU

−0.012 0.094 0.659 0.999

Pr e-

N (0, 1)

−0.008 0.096 0.673 1.004

0.006 0.097 0.676 1.023

−0.012 0.100 0.657 1.003

−0.014 0.097 0.658 1.015

−0.011 0.082 0.678 1.010

0.866 0.003 0.861 0.873

0.50

Bias RMSE 95% CL 95% CU

0.008 0.092 0.671 1.017

−0.009 0.088 0.668 1.021

0.007 0.080 0.707 1.006

−0.003 0.086 0.668 0.993

−0.010 0.088 0.676 0.988

−0.021 0.092 0.663 0.967

0.867 0.004 0.861 0.875

0.75

Bias RMSE 95% CL 95% CU

−0.009 0.087 0.679 1.003

−0.005 0.092 0.675 1.025

−0.003 0.083 0.683 0.987

−0.015 0.092 0.675 1.055

−0.007 0.086 0.675 1.006

−0.008 0.100 0.675 1.026

0.867 0.004 0.861 0.874

Bias RMSE 95% CL 95% CU

−0.013 0.118 0.620 1.058

0.004 0.104 0.635 1.078

−0.003 0.113 0.636 1.031

−0.021 0.119 0.620 1.094

−0.011 0.110 0.613 1.072

−0.030 0.119 0.596 1.020

0.869 0.003 0.863 0.874

Bias RMSE 95% CL 95% CU

−0.015 0.086 0.665 0.974

−0.002 0.089 0.673 1.004

−0.005 0.085 0.700 0.987

−0.016 0.091 0.659 0.997

−0.012 0.098 0.680 1.028

−0.006 0.089 0.677 1.030

0.870 0.004 0.863 0.877

Bias RMSE 95% CL 95% CU

−0.003 0.126 0.616 1.098

−0.001 0.121 0.649 1.121

−0.011 0.108 0.624 1.025

0.001 0.108 0.622 1.033

0.000 0.122 0.594 1.069

−0.002 0.121 0.636 1.078

0.868 0.003 0.863 0.876

t3

0.25

0.50

Jo

0.75

urn

N (0, 1)

al

n = 200

Note: The last column presents the averaged Bayesian estimates, standard error estimates, and the 95% equal-tailed credible intervals of ξ on the basis of 100 replications.

13

Journal Pre-proof

Table 3: BRBridge QR results of Model 2 Error

τ

Parameter

β1 = 0.85

β2 = 0

β3 = 0.85

β4 = 0

β5 = 0

β6 = 0

ξ

n = 100 −0.035 0.144 0.531 1.057

0.012 0.088 −0.142 0.185

−0.021 0.141 0.542 1.080

0.005 0.097 −0.201 0.210

−0.008 0.101 −0.218 0.182

0.012 0.103 −0.202 0.198

0.616 0.060 0.503 0.705

0.50

Bias RMSE 95% CL 95% CU

−0.029 0.124 0.581 1.086

0.001 0.086 −0.174 0.173

−0.034 0.129 0.544 1.012

−0.003 0.072 −0.146 0.130

0.005 0.084 −0.159 0.203

0.004 0.090 −0.186 0.181

0.589 0.086 0.407 0.690

0.75

Bias RMSE 95% CL 95% CU

−0.033 0.137 0.562 1.070

−0.009 0.101 −0.236 0.237

−0.044 0.123 0.570 1.023

−0.004 0.098 −0.202 0.210

−0.012 0.077 −0.176 0.121

−0.006 0.088 −0.179 0.195

0.602 0.066 0.473 0.705

0.25

Bias RMSE 95% CL 95% CU

−0.056 0.175 0.416 1.103

0.003 0.100 −0.188 0.213

−0.036 0.167 0.474 1.122

0.000 0.104 −0.225 0.186

0.017 0.115 −0.162 0.247

−0.034 0.133 −0.323 0.187

0.640 0.064 0.505 0.728

0.50

Bias RMSE 95% CL 95% CU

−0.045 0.143 0.561 1.114

−0.007 0.083 −0.178 0.138

−0.070 0.154 0.485 1.046

−0.008 0.097 −0.225 0.154

0.013 0.098 −0.170 0.247

−0.008 0.097 −0.252 0.173

0.627 0.047 0.516 0.701

0.75

Bias RMSE 95% CL 95% CU

−0.040 0.173 0.449 1.134

0.006 0.133 −0.255 0.290

−0.024 0.154 0.536 1.142

0.001 0.101 −0.206 0.160

0.003 0.111 −0.269 0.243

−0.005 0.102 −0.198 0.207

0.634 0.063 0.498 0.725

of

Bias RMSE 95% CL 95% CU

p ro

t3

0.25

Pr e-

N (0, 1)

n = 200

−0.021 0.095 0.677 1.001

−0.008 0.070 −0.172 0.126

−0.029 0.100 0.642 0.991

0.008 0.076 −0.145 0.174

0.006 0.073 −0.121 0.171

0.018 0.060 −0.073 0.146

0.553 0.091 0.331 0.663

0.50

Bias RMSE 95% CL 95% CU

0.002 0.090 0.661 1.015

−0.001 0.063 −0.126 0.121

−0.015 0.081 0.706 1.010

0.001 0.059 −0.129 0.103

−0.004 0.068 −0.120 0.124

0.006 0.067 −0.130 0.144

0.558 0.068 0.388 0.668

0.75

Bias RMSE 95% CL 95% CU

−0.021 0.084 0.673 0.974

0.006 0.068 −0.137 0.138

−0.008 0.085 0.702 1.000

0.001 0.076 −0.163 0.154

0.011 0.082 −0.141 0.180

−0.001 0.071 0.145 0.180

0.551 0.095 0.362 0.679

al

Bias RMSE 95% CL 95% CU

0.25

Bias RMSE 95% CL 95% CU

−0.010 0.119 0.616 1.100

−0.012 0.078 −0.170 0.117

−0.011 0.118 0.620 1.037

0.002 0.088 −0.180 0.148

−0.003 0.089 −0.171 0.190

0.005 0.092 −0.172 0.191

0.603 0.057 0.497 0.704

0.50

Bias RMSE 95% CL 95% CU

−0.020 0.092 0.680 1.019

0.004 0.057 −0.119 0.105

−0.006 0.089 0.688 1.033

0.006 0.056 −0.111 0.119

−0.003 0.068 −0.133 0.134

−0.015 0.074 −0.185 0.108

0.567 0.074 0.392 0.672

0.75

Bias RMSE 95% CL 95% CU

−0.045 0.126 0.619 1.025

−0.003 0.079 −0.154 0.156

−0.018 0.137 0.602 1.103

−0.004 0.088 −0.176 0.211

−0.007 0.086 −0.203 0.140

0.006 0.082 −0.160 0.154

0.600 0.070 0.483 0.697

Jo

t3

0.25

urn

N (0, 1)

Note: The last column presents the averaged Bayesian estimates, standard error estimates, and the 95% equal-tailed credible intervals of ξ on the basis of 100 replications.

14

Journal Pre-proof

Table 4: BARBridge QR results of Model 2 Error

τ

Parameter

β1 = 0.85

β2 = 0

β3 = 0.85

β4 = 0

β5 = 0

β6 = 0

ξ

n = 100 −0.012 0.145 0.583 1.132

−0.003 0.084 −0.208 0.145

−0.012 0.126 0.596 1.043

−0.006 0.078 −0.162 0.170

0.50

Bias RMSE 95% CL 95% CU

−0.034 0.124 0.589 1.064

0.001 0.083 −0.136 0.168

−0.022 0.120 0.625 1.058

−0.001 0.070 −0.151 0.147

0.75

Bias RMSE 95% CL 95% CU

0.011 0.134 0.595 1.096

0.004 0.083 −0.154 0.161

−0.033 0.153 0.516 1.095

0.005 0.076 −0.171 0.172

0.25

Bias RMSE 95% CL 95% CU

−0.032 0.174 0.502 1.167

−0.017 0.098 −0.233 0.138

−0.028 0.156 0.531 1.144

0.50

Bias RMSE 95% CL 95% CU

0.001 0.084 −0.200 0.156

0.75

Bias RMSE 95% CL 95% CU

−0.006 0.089 −0.193 0.154

−0.022 0.153 0.532 1.089 −0.037 0.178 0.477 1.165

0.000 0.081 −0.205 0.176

−0.009 0.101 −0.219 0.191

0.786 0.034 0.729 0.828

0.003 0.077 −0.143 0.145

−0.010 0.081 −0.161 0.166

0.787 0.022 0.748 0.821

0.009 0.090 −0.183 0.201

−0.001 0.089 −0.198 0.191

0.785 0.034 0.729 0.834

−0.005 0.107 −0.259 0.183

−0.001 0.096 −0.216 0.224

0.016 0.090 −0.164 0.240

0.796 0.020 0.759 0.832

−0.011 0.131 0.570 1.076

−0.002 0.087 −0.184 0.182

0.010 0.073 −0.132 0.155

−0.011 0.083 −0.206 0.162

0.797 0.019 0.750 0.832

−0.037 0.178 0.483 1.090

0.013 0.102 −0.189 0.263

0.018 0.106 −0.172 0.262

0.000 0.102 −0.210 0.190

0.800 0.022 0.752 0.840

of

Bias RMSE 95% CL 95% CU

p ro

t3

0.25

Pr e-

N (0, 1)

n = 200

−0.011 0.094 0.656 1.006

0.006 0.051 −0.109 0.121

0.013 0.098 0.684 1.030

−0.003 0.068 −0.139 0.141

0.009 0.056 −0.100 0.143

−0.009 0.048 −0.100 0.070

0.767 0.032 0.707 0.817

0.50

Bias RMSE 95% CL 95% CU

0.004 0.077 0.726 1.010

0.003 0.054 −0.100 0.116

−0.015 0.093 0.671 0.987

0.000 0.052 −0.100 0.097

0.001 0.048 −0.107 0.089

−0.003 0.053 −0.102 0.096

0.771 0.028 0.712 0.819

0.75

Bias RMSE 95% CL 95% CU

−0.012 0.086 0.684 1.009

−0.001 0.056 −0.123 0.137

0.003 0.085 0.696 1.020

0.000 0.059 −0.115 0.132

0.007 0.064 −0.106 0.161

0.002 0.061 −0.115 0.131

0.768 0.037 0.686 0.820

al

Bias RMSE 95% CL 95% CU

0.25

Bias RMSE 95% CL 95% CU

−0.004 0.113 0.629 1.040

−0.008 0.064 −0.131 0.119

0.006 0.103 0.679 1.060

0.004 0.079 −0.168 0.217

−0.004 0.068 −0.157 0.131

−0.004 0.066 −0.128 0.124

0.784 0.025 0.730 0.823

0.50

Bias RMSE 95% CL 95% CU

−0.008 0.099 0.648 1.009

−0.004 0.058 −0.121 0.131

−0.017 0.105 0.649 1.035

0.002 0.063 −0.096 0.146

0.000 0.068 −0.135 0.154

−0.011 0.070 −0.201 0.104

0.781 0.023 0.738 0.816

0.75

Bias RMSE 95% CL 95% CU

0.004 0.115 0.606 1.062

−0.012 0.068 −0.153 0.121

−0.003 0.112 0.655 1.075

0.006 0.068 −0.149 0.133

−0.002 0.072 −0.141 0.143

0.016 0.076 −0.118 0.210

0.784 0.023 0.741 0.818

Jo

t3

0.25

urn

N (0, 1)

Note: The last column presents the averaged Bayesian estimates, standard error estimates, and the 95% equal-tailed credible intervals of ξ on the basis of 100 replications.

15

Journal Pre-proof

214

For dense coefficients (Model 1), the proposed methods tend to select a large penalty exponent ξ, and the estimated values of ξ range from 0.8 to 0.9 across different quantiles and error distributions. Based on the density plot of the posterior samples of ξ in Figure 3, the posterior distribution of ξ tends to be left-skewed.

Model errors can be evaluated by the following criteria based on the 100 replications:

\[ \mathrm{MMAD} = \mathrm{Mean}\!\left[ \frac{1}{n} \sum_{i=1}^{n} \big| x_i^{T}(\hat{\beta} - \beta_{\mathrm{true}}) \big| \right], \qquad \mathrm{MMSE} = \mathrm{Mean}\!\left( (\hat{\beta} - \beta_{\mathrm{true}})^{T} (\hat{\beta} - \beta_{\mathrm{true}}) \right). \tag{19} \]
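Criteria (19) translate directly into R. In the sketch below (our own), beta_hats is a hypothetical list holding the 100 replication estimates:

# MMAD and MMSE over replications, following (19)
mmad_mmse <- function(beta_hats, beta_true, X) {
  mad_r <- sapply(beta_hats, function(b) mean(abs(X %*% (b - beta_true))))
  mse_r <- sapply(beta_hats, function(b) sum((b - beta_true)^2))
  c(MMAD = mean(mad_r), MMSE = mean(mse_r))
}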

The MMADs and MMSEs of the BRBridge and BARBridge regularization methods over the three quantiles under Models 1 and 2 are presented in Table 5. For comparison, the MMADs and MMSEs of six existing methods, namely, Bayesian ridge (BRidge), BLASSO, Bayesian L1/2 (BL1/2), Bayesian adaptive ridge (BARidge), BALASSO, and Bayesian adaptive L1/2 (BAL1/2) QR, are also provided. Table 5 shows that BRBridge and BARBridge QR are not consistently optimal compared with the other six classes of regularization methods for dense coefficients (Model 1) at the 0.25th quantile. However, at the median and the 0.75th quantile, BRBridge and BARBridge QR show certain advantages, with small MMAD or MMSE in many cases.

Pr e-

p ro

218

For sparse Model 2, we aim to examine whether BRBridge and BARBridge QR can identify zero coefficients exactly. Likewise, the convergence, trace, and density plots of the posterior samples (not reported) indicate rapid convergence of the MCMC algorithm. In particular, for the zero coefficients β2, β4, β5, and β6, the MCMC samples of BRBridge and BARBridge QR converge stationarily to zero. Tables 3 and 4 show that the BRBridge and BARBridge QR procedures identify significant covariates and zero-valued coefficients accurately for different error distributions over the three quantiles. Similar to the conclusion for Model 1, the estimation results under the normal error outperform those under the t3 error over the three quantiles. In addition, a comparison of the estimation results of BRBridge and BARBridge QR indicates that the latter estimates zero coefficients more efficiently, with smaller RMSEs and shorter confidence intervals covering zero. Hence, the proposed procedures yield unbiased and sparse estimation results for all regression coefficients. For sparse coefficients, the posterior density plot of the penalty exponent ξ (not reported) shows that its posterior distribution tends to be left-skewed on the interval (0, 1) and similar across quantiles. Tables 3 and 4 show that the estimated values of ξ range from 0.6 to 0.8 across different quantiles and error distributions, thereby indicating that the proposed methods select a smaller ξ for sparse coefficients than for dense coefficients. Moreover, the optimal value of the penalty exponent ξ for Bayesian bridge regression mostly ranges from 0.5 to 1, a phenomenon consistent with the conclusion of Xu et al. (2010). Our proposed BRBridge and BARBridge QR can thus be regarded as a balance between L1/2- and LASSO-regularized regression. Based on Table 5, BRBridge and BARBridge QR perform better than the other methods, with smaller MMAD or MMSE in the majority of cases. For sparse coefficients, the results of BRBridge and BARBridge QR are close to those of BL1/2 and BAL1/2 QR, and both outperform BLASSO and BALASSO QR, respectively. Moreover, when other conditions are the same, BARBridge QR outperforms BRBridge QR, with smaller MMAD or MMSE in all cases.

16

Journal Pre-proof

246

sparse coefficients, the results of BRBridge and BARBridge QR are close to the results of BL1/2

247

and BAL1/2 QR. They both outperform BLASSO and BALASSO QR, respectively. Moreover,

248

when other conditions are the same, BARBridge QR outperforms BRBridge QR due to smaller

249

MMAD or MMSE for all cases. The simulations are conducted using a laptop [Dell Intel(R) Core(TM) i7-6600U CPU] and

251

statistical software R3.5.2. The R code is freely available upon request. In the setting of Model

252

1: n = 100, εt ∼ t3 , and τ = 0.5, the computing times of BRBridge and BARBridge QR for

253

completing one replication are 1.718 and 1.850 minutes, respectively.

254

4.2. Simulation 2

p ro

of

250

255

In this section, we conduct additional simulations to assess the performance of the proposed

256

methods under different prior inputs and model settings. Unless otherwise noted, we focus on

257

n = 100, εt ∼ t3 , and τ = 0.5 in the subsequent analyses.

We first conduct a sensitivity analysis to check whether different prior inputs influence the

259

results of posterior samples. The hyperparameters in Prior 1 are disturbed as follows: (Prior 2)

260

a = 1, b = 2, c = 1, d = 10, e = 1/2, and f = 2. The results obtained under Prior 2 are similar to

261

those reported in Tables 1–4 and not reported.

Pr e-

258

262

As suggested by an anonymous reviewer, we consider the following additional model settings:

Model 3 (High correlation case): p = 6, β = (0.85, 0.85, 0.85, 0.85, 0.85, 0.85)^T, and x_t ∼ N(0, Σ), where Σ is a correlation matrix with off-diagonal elements 0.5.

Model 4 (High-dimensional case): p = 60, β_{1:6} = (0.85, 0.85, 0.85, 0.85, 0.85, 0.85)^T, β_{7:60} = (0, ..., 0)^T, and x_t ∼ N(0, Σ), where Σ is an identity matrix.

Model 5 (Small n, large p case): p > n, e.g., p = 150 and n = 100.

Based on BRBridge and BARBridge QR, we collect 15,000 posterior samples after discarding 5,000 burn-ins to perform Bayesian inference. The results for Models 3 and 4 are reported in Table 6. Compared with those in Models 1 and 2, the performances of the proposed methods are equally good in Model 3, unsatisfactory when n = 100 but acceptable when n = 200 in Model 4, and break down in Model 5. Furthermore, we notice that the estimated values of ξ in Model 4 are significantly smaller than those in Models 1 and 2, thereby reconfirming that the proposed methods select a smaller ξ for sparse coefficients than for dense coefficients. Hence, routinely fixing ξ at a constant is unreasonable and may reduce estimation efficiency.

Jo

urn

267

17

Journal Pre-proof

Table 5: MMAD, MMSE and their standard errors (in parentheses) based on 100 replications

τ

Sample size

n = 100

n = 100

n = 200

n = 200

Error

N (0, 1)

t3

N (0, 1)

t3

Method

MMAD

MMSE

MMAD

MMSE

MMAD

MMSE

MMAD

MMSE

Model 1

0.75

.105(.073) .091(.053) .100(.063) .109(.060)

.281(.083) .294(.080) .298(.098) .295(.087)

.144(.091) .153(.082) .163(.127) .160(.094)

.164(.048) .165(.051) .172(.057) .170(.049)

.048(.027) .048(.029) .053(.035) .052(.031)

.188(.061) .202(.059) .208(.066) .211(.060)

.064(.041) .072(.043) .077(.049) .079(.045)

BARidge BALASSO BAL1/2 BARBridge

.239(.067) .240(.077) .228(.068) .253(.074)

.104(.059) .106(.073) .095(.060) .116(.069)

.299(.085) .284(.080) .312(.085) .307(.091)

.165(.098) .145(.081) .174(.100) .172(.118)

.171(.043) .164(.049) .163(.048) .172(.051)

.050(.024) .047(.029) .045(.025) .053(.030)

.200(.062) .209(.070) .214(.063) .207(.063)

.071(.048) .080(.061) .081(.048) .077(.048)

BRidge BLASSO BL1/2 BRBridge

.214(.057) .220(.061) .222(.064) .209(.066)

.084(.046) .086(.051) .087(.049) .077(.044)

.237(.077) .244(.080) .238(.074) .231(.075)

.104(.070) .113(.082) .103(.064) .099(.064)

.155(.043) .148(.044) .163(.051) .156(.044)

.042(.022) .039(.024) .047(.029) .042(.023)

.174(.062) .175(.054) .175(.054) .169(.052)

.055(.041) .054(.034) .054(.033) .051(.033)

BARidge BALASSO BAL1/2 BARBridge

.222(.072) .215(.069) .215(.070) .222(.065)

.089(.061) .083(.055) .085(.053) .091(.054)

.243(.082) .246(.074) .261(.087) .243(.085)

.113(.080) .113(.068) .127(.096) .109(.077)

.163(.045) .153(.046) .154(.045) .161(.048)

.046(.025) .042(.025) .041(.024) .046(.028)

.174(.049) .170(.051) .181(.057) .166(.047)

.052(.028) .051(.029) .058(.036) .048(.027)

BRidge BLASSO BL1/2 BRBridge

.234(.072) .222(.074) .236(.069) .236(.066)

.102(.066) .089(.059) .103(.064) .101(.054)

BARidge BALASSO BAL1/2 BARBridge

.238(.070) .234(.067) .241(.071) .233(.069)

.104(.060) .097(.054) .105(.063) .097(.060)

of

.236(.081) .223(.064) .236(.076) .247(.071)

p ro

0.50

BRidge BLASSO BL1/2 BRBridge

Pr e-

0.25

.284(.079) .300(.088) .295(.097) .294(.095)

.149(.087) .161(.098) .157(.098) .161(.111)

.162(.047) .164(.048) .166(.048) .168(.048)

.046(.027) .047(.027) .049(.029) .049(.027)

.210(.070) .201(.060) .206(.074) .209(.067)

.080(.051) .072(.042) .077(.057) .078(.052)

.290(.092) .283(.084) .288(.090) .301(.100)

.153(.100) .146(.088) .160(.108) .140(.125)

.164(.051) .169(.052) .173(.049) .166(.047)

.048(.029) .050(.030) .052(.028) .048(.026)

.196(.061) .202(.057) .206(.069) .204(.064)

.070(.044) .073(.041) .077(.058) .072(.048)

Model 2

0.75

.094(.056) .083(.051) .070(.053) .073(.051)

BARidge BALASSO BAL1/2 BARBridge

.200(.064) .180(.066) .159(.066) .157(.071)

.074(.048) .062(.043) .050(.041) .057(.048)

.286(.082) .263(.083) .221(.074) .232(.086)

.146(.086) .125(.072) .089(.057) .091(.063)

.168(.057) .149(.046) .140(.049) .143(.050)

.050(.032) .040(.024) .036(.026) .038(.028)

.207(.060) .195(.054) .173(.058) .176(.057)

.074(.044) .067(.041) .052(.034) .054(.036)

.255(.078) .225(.073) .197(.102) .206(.094)

.118(.072) .093(.062) .085(.111) .082(.079)

.207(.060) .138(.044) .130(.056) .129(.046)

.077(.047) .035(.022) .033(.027) .031(.022)

.207(.060) .174(.064) .148(.059) .150(.058)

.077(.047) .054(.037) .041(.032) .042(.031)

al

.226(.071) .213(.064) .194(.068) .201(.069)

urn

0.50

BRidge BLASSO BL1/2 BRBridge

BRidge BLASSO BL1/2 BRBridge

.202(.059) .197(.067) .173(.059) .171(.067)

.076(.046) .072(.050) .057(.045) .059(.048)

.245(.073) .218(.076) .197(.061) .201(.074)

.110(.067) .092(.063) .070(.040) .079(.057)

.154(.044) .141(.045) .134(.043) .131(.045)

.041(.024) .036(.023) .032(.022) .031(.020)

.171(.049) .160(.049) .146(.054) .135(.043)

.051(.030) .045(.029) .039(.032) .033(.023)

BARidge BALASSO BAL1/2 BARBridge

.190(.055) .170(.063) .157(.061) .158(.057)

.065(.041) .054(.042) .046(.035) .043(.037)

.222(.075) .194(.067) .169(.071) .167(.072)

.096(.065) .069(.051) .058(.058) .057(.049)

.161(.052) .125(.041) .111(.046) .114(.041)

.045(.026) .028(.017) .023(.018) .022(.018)

.161(.052) .142(.046) .117(.051) .112(.049)

.045(.026) .036(.023) .027(.026) .030(.028)

BRidge BLASSO BL1/2 BRBridge

.229(.066) .211(.068) .207(.076) .189(.067)

.095(.054) .081(.056) .082(.066) .067(.049)

.283(.076) .275(.080) .235(.091) .233(.083)

.144(.076) .137(.079) .106(.085) .103(.079)

.163(.050) .155(.041) .142(.051) .141(.051)

.046(.028) .042(.021) .036(.027) .036(.025)

.199(.061) .189(.069) .169(.067) .166(.063)

.071(.040) .064(.047) .054(.045) .052(.042)

BARidge BALASSO BAL1/2 BARBridge

.206(.058) .176(.070) .171(.078) .173(.070)

.076(.042) .068(.046) .057(.050) .059(.055)

.282(.082) .226(.076) .197(.075) .206(.094)

.148(.099) .094(.062) .076(.064) .073(.066)

.195(.063) .139(.052) .122(.049) .124(.047)

.069(.044) .036(.028) .027(.020) .028(.021)

.195(.063) .160(.059) .157(.059) .158(.060)

.069(.044) .047(.034) .045(.032) .045(.033)

Jo

0.25

18

Journal Pre-proof

Model 3, n = 100

0.50

0.75

β1

β2

β3

β4

β5

β6

ξ

MMAD

MMSE

Bias

−0.048

−0.063

−0.060

−0.031

−0.019

−0.039

0.795

0.322

0.268

RMSE

0.232

0.196

0.220

0.210

0.192

0.223

0.009

0.103

0.163

95% CL

0.292

0.394

0.455

0.421

0.425

0.345

0.772

95% CU

1.214

1.117

1.172

1.165

1.094

1.120

0.808

p ro

0.25

Para

Bias

−0.073

−0.036

−0.026

−0.019

−0.073

−0.040

0.798

0.296

0.214

RMSE

0.181

0.205

0.185

0.185

0.206

0.175

0.009

0.100

0.145

95% CL

0.443

0.389

0.487

0.447

0.318

0.519

0.779

95% CU

1.088

1.283

1.194

1.153

1.075

1.110

0.811

Pr e-

τ

of

Table 6: BRBridge QR results of Models 3 and 4: εt ∼ t3 , and τ = 0.5

Bias

−0.019

−0.058

−0.057

−0.047

−0.020

−0.044

0.795

0.324

0.275

RMSE

0.225

0.218

0.212

0.209

0.205

0.221

0.008

0.104

0.160

95% CL

0.407

0.320

0.438

0.418

0.370

0.364

0.779

95% CU

1.196

1.159

1.156

1.130

1.175

1.234

0.807

Model 4, n = 200

0.50

β1

β2

β3

β25

β35

β45

β55

ξ

MMAD

MMSE

Bias

−0.080

−0.086

−0.074

0.004

0.004

0.005

−0.013

−0.010

0.485

0.380

0.302

RMSE

0.186

0.160

0.174

0.047

0.068

0.053

0.054

0.046

0.059

0.074

0.128

95% CL

0.410

0.509

0.483

−0.088

−0.136

−0.091

−0.148

−0.099

0.398

95% CU

1.092

1.006

1.040

0.088

0.120

0.124

0.084

0.062

0.610

Bias

−0.086

−0.092

−0.071

−0.005

−0.002

−0.001

0.003

0.000

0.498

0.334

0.230

RMSE

0.154

0.150

0.147

0.039

0.036

0.048

0.045

0.047

0.062

0.078

0.109

95% CL

0.517

0.529

0.519

−0.075

−0.072

−0.083

−0.091

−0.100

0.401

0.985

0.963

1.064

0.075

0.061

0.123

0.084

0.091

0.621

Bias

−0.098

−0.08

−0.101

−0.003

0.000

0.000

−0.003

0.005

0.514

0.379

0.313

RMSE

0.174

0.160

0.189

0.050

0.053

0.055

0.040

0.050

0.073

0.098

0.163

0.462

0.472

0.379

−0.126

−0.091

−0.107

−0.105

−0.081

0.419

1.024

0.999

1.033

0.073

0.105

0.127

0.070

0.120

0.675

95% CU 0.75

β15

al

0.25

Para

95% CL 95% CU

urn

τ

Jo

Note: The columns “MMAD” and “MMSE” present their averages and standard deviations based on 100 replications.

19

Journal Pre-proof

276

5. Application In this section, we applied the proposed estimation procedures to analyze the Boston housing

278

dataset. This dataset was named as “BostonHousing” and can be downloaded using R pack-

279

age “mlbench” by inputting the following commands in R Console: data(”BostonHousing”, pack-

280

age=”mlbench”), original=BostonHousing. The original data include 506 observations and 13

281

predictors variables. The outcome variable is the median value of owner-occupied homes in USD

282

1,000’s (medv). The 13 predictor variables are as follows: per capita crime rate (crim); propor-

283

tion of residential land zone for lots over 25,000 square feet (zn); proportion of non-retail business

284

acres per town (indus); Charles River dummy variable (chas); nitric oxide concentration (parts

285

per 10 million, nox); average number of rooms per dwelling (rm); proportion of owner-occupied

286

units built prior to 1940 (age); weighted distances to five Boston employment centers (dis); in-

287

dex of accessibility to radial highways (rad); full-value property-tax rate per USD 10,000 (tax),

288

pupil–teacher ratio by town (ptratio); 1, 000(B − 0.63)2 , where B is the proportion of blacks by

289

town (bk); and percentage of lower status of the population (lstat).

Pr e-

p ro

of

277

290

We fitted the Boston housing dataset using simple linear regression with a constant term. The

291

ordinary LSE indicated that the linear regression model was a proper model. We partitioned the

292

whole dataset into a training set (first 456 samples) and test set (last 50 samples). Model fitting

293

294

295

was conducted on the training set, and performance is evaluated on the test set, which can be ∑506 1 T ˆ evaluated by MAD = 50 i=457 |yi − xi β|. Quantile levels 0.25, 0.50, and 0.75 were considered.

The convergence plot (not reported) showed that the MCMC algorithm converged within 10,000 iterations. Thus, for each case we ran the Gibbs sampling algorithm 20,000 iterations, discarded

297

the first 10,000 iterations as burn-ins, and used the remaining 10,000 samples to perform posterior

298

inference. Table 7 presents the parameter estimates and their standard error estimates (se). All

299

the methods under consideration specified a sparse regression model for the housing data. The

300

estimation results are similar to those reported in the literature (Alhamzawi and Algamal, 2018).

urn

301

al

296

Table 8 presents the MADs based on BRBridge, BARBridge, BRidge, BLASSO, BL1/2 , BARidge, BALASSO, and BAL1/2 QR. As shown in Table 8, BRBridge QR performs similarly to BL1/2 QR

303

over the three quantiles and has better prediction accuracy than BRidge and BLASSO QR due to

304

smaller MAD values. Meanwhile, BARBridge QR performs slightly better than BAL1/2 , BARidge,

305

and BALASSO QR over the three quantiles. In addition, nearly all the adaptive versions of the

306

estimation methods considered perform better than their non-adaptive versions.

Jo

302

[Table 7: Estimation results of regression coefficients for the Boston housing dataset; entries are Est (se) for the 13 predictors (crim, zn, indus, chas, nox, rm, age, dis, rad, tax, ptratio, bk, lstat) under the BRBridge, BARBridge, BRidge, BLASSO, BL1/2, BARidge, BALASSO, and BAL1/2 methods at τ = 0.25, 0.50, and 0.75.]

Table 8: MAD values for the Boston housing dataset under different methods

Method       τ = 0.25    τ = 0.50    τ = 0.75
BRidge        28.33       25.37       16.10
BLASSO        24.02       14.29        4.855
BL1/2         14.97        4.724       4.029
BRBridge      15.24        4.282       4.106
BARidge        7.084       3.359       4.898
BALASSO        5.838       3.189       4.324
BAL1/2         5.134       2.917       4.137
BARBridge      5.115       3.026       4.239

6. Conclusion

In this study, we consider BRBridge and BARBridge QR. Simulations and a real-data example are conducted to illustrate the proposed procedures. The two Bayesian penalized procedures perform well in general and can identify and estimate significant regression variables in both dense and sparse cases. In the simulations, the penalty exponent ξ has a left-skewed posterior distribution on the interval (0, 1) for both dense and sparse coefficients. However, the average estimate of ξ is mostly close to 1 for dense coefficients and to 0.5 for sparse coefficients. For future work, this phenomenon must be verified from a theoretical perspective. As shown in the simulations, the proposed methods cannot manage the case of small n and large p (p > n), thereby restricting their application to the analysis of ultrahigh-dimensional data. How to address this problem is certainly of future research interest. Moreover, in this study we set the prior distribution of the penalty exponent ξ as a standard uniform distribution. Other prior distributions of ξ, such as the truncated normal or truncated Gamma distributions, may improve the estimation efficiency of BRBridge and BARBridge QR and are worthy of further investigation. Furthermore, we illustrate the proposed BRBridge and BARBridge QR in the context of a linear regression model. The proposed methods can apparently be applied to nonlinear models and many other statistical models. Finally, several other methods assess the performance of quantile estimation without fixing τ. For example, the composite quantile regression (CQR) proposed by Zou and Yuan (2008) assesses the estimation accuracy of the quantile function by averaging a series of quantile regression estimators at different quantiles.
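In the notation of Zou and Yuan (2008), with equally spaced quantile levels τ_k = k/(K + 1), k = 1, …, K, the CQR estimator solves

    min_{b_1,…,b_K, β} Σ_{k=1}^{K} Σ_{i=1}^{n} ρ_{τ_k}(y_i − b_k − x_i^T β),

where ρ_τ(u) = u{τ − I(u < 0)} is the check loss, so that the K quantile-specific intercepts b_k vary across levels while all levels share the common slope vector β.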

327

Owing to its advantages in integrating various quantile levels to improve estimation, CQR and its variants have received considerable attention in recent years (e.g., Kai et al., 2010, 2011; Jiang et al., 2012, 2014; Wang et al., 2013; Zhao et al., 2017). Another direction is to estimate the quantile function Q(τ) = F^{−1}(τ) = inf{x : F(x) ≥ τ}, where F(x) is the cumulative distribution function of a random variable X and Q(τ) is a continuous function of τ on the interval (0, 1) (e.g., Cheng, 1995; Cai, 2010; Sankaran and Midhu, 2017).
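For instance, the empirical version of this left-continuous inverse is directly available in R; the following is a minimal sketch with simulated data.

    ## Empirical quantile function Q_n(τ) = inf{x : F_n(x) ≥ τ};
    ## quantile() with type = 1 computes this left-continuous
    ## inverse of the empirical distribution function F_n
    set.seed(1)
    x <- rnorm(100)
    Q <- function(tau) quantile(x, probs = tau, type = 1, names = FALSE)
    Q(c(0.25, 0.50, 0.75))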

Quantile regression without fixing τ may also be considered jointly with the proposed methods. The aforementioned extensions will considerably enlarge the application scope of the proposed methods.

Acknowledgements

The work is supported by the China Postdoctoral Science Foundation (No. 2017M610156), the National Social Science Foundation of China (No. 15ZDA009), the Research Grants Council of the Hong Kong Special Administrative Region (Nos. 14303017 and 14301918), and direct grants from The Chinese University of Hong Kong.


References

Alhamzawi, R., Yu, K., Benoit, D. F. (2012). Bayesian adaptive Lasso quantile regression. Statistical Modelling, 12, 279-297.

Alhamzawi, R., Ali, H. T. M. (2018). Bayesian Tobit quantile regression with L1/2 penalty. Communications in Statistics – Simulation and Computation, 47(6), 1739-1750.

Alhamzawi, R., Algamal, Z. Y. (2018). Bayesian bridge quantile regression. Communications in Statistics – Simulation and Computation, to appear.

Betancourt, B., Rodriguez, A., Boyd, N. (2017). Bayesian fused Lasso regression for dynamic binary networks. Journal of Computational and Graphical Statistics, 26(4), 840-850.

Cai, Y. Z. (2010). Multivariate quantile function models. Statistica Sinica, 20, 481-496.

Cheng, C. (1995). The Bernstein polynomial estimator of a smooth quantile function. Statistics & Probability Letters, 24(4), 321-330.

Davino, C., Furno, M., Vistocco, D. (2014). Quantile Regression: Theory and Applications. New York: John Wiley & Sons.

Deng, J., Pandey, M. D. (2008). Estimation of the maximum entropy quantile function using fractional probability weighted moments. Structural Safety, 30(4), 307-319.

Devroye, L. (1986). Non-Uniform Random Variate Generation. New York: Springer-Verlag.

Fan, J., Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96(456), 1348-1360.

Fu, W. (1998). Penalized regressions: the bridge versus the Lasso. Journal of Computational and Graphical Statistics, 7(3), 397-416.

Gefang, D. (2014). Bayesian doubly adaptive elastic-net Lasso for VAR shrinkage. International Journal of Forecasting, 30(1), 1-11.

Geraci, M., Bottai, M. (2007). Quantile regression for longitudinal data using the asymmetric Laplace distribution. Biostatistics, 8(1), 140-154.

Huang, J., Horowitz, J. L., Ma, S. (2008). Asymptotic properties of bridge estimators in sparse high-dimensional regression models. Annals of Statistics, 36(2), 587-613.

Huang, Y. (2016). Quantile regression-based Bayesian semiparametric mixed-effects models for longitudinal data with non-normal, missing and mismeasured covariate. Journal of Statistical Computation and Simulation, 86(6), 1183-1202.

Hunter, D. R., Lange, K. (2000). Quantile regression via an MM algorithm. Journal of Computational and Graphical Statistics, 9(1), 60-77.

Jiang, X. J., Jiang, J. C., Song, X. Y. (2012). Oracle model selection for nonlinear models based on weighted composite quantile regression. Statistica Sinica, 22, 1479-1506.

Jiang, X. J., Jiang, J. C., Song, X. Y. (2014). Weighted composite quantile regression estimation of DTARCH models. Econometrics Journal, 17, 1-23.

Kai, B., Li, R. Z., Zou, H. (2010). Local composite quantile regression smoothing: an efficient and safe alternative to local polynomial regression. Journal of the Royal Statistical Society: Series B, 72, 49-69.

Kai, B., Li, R. Z., Zou, H. (2011). New efficient estimation and variable selection methods for semiparametric varying-coefficient partially linear models. Annals of Statistics, 39, 305-332.

Knight, K., Fu, W. (2000). Asymptotics for lasso-type estimators. Annals of Statistics, 28(5), 1356-1378.

Kobayashi, G., Kozumi, H. (2013). Bayesian analysis of quantile regression for censored dynamic panel data. Computational Statistics, 27, 359-380.

Koenker, R. (2005). Quantile Regression. Cambridge: Cambridge University Press.

Koenker, R., Bassett, G. (1978). Regression quantiles. Econometrica, 46(1), 33-50.

Koenker, R., Chernozhukov, V., He, X., Peng, L. (2017). Handbook of Quantile Regression. Portland: Chapman and Hall.

Koenker, R., Park, B. J. (1996). An interior point algorithm for nonlinear quantile regression. Journal of Econometrics, 71(1-2), 265-283.

Kozumi, H., Kobayashi, G. (2011). Gibbs sampling methods for Bayesian quantile regression. Journal of Statistical Computation and Simulation, 81, 1565-1578.

Mallick, H., Yi, N. (2018). Bayesian bridge regression. Journal of Applied Statistics, 45(6), 988-1008.

Nadarajah, S. (2006). Acknowledgement of priority: the generalized normal distribution. Journal of Applied Statistics, 33(9), 1031-1032.

Park, T., Casella, G. (2008). The Bayesian Lasso. Journal of the American Statistical Association, 103(482), 681-686.

Polson, N. G., Scott, J. G., Windle, J. (2014). The Bayesian bridge. Journal of the Royal Statistical Society: Series B, 76(4), 713-733.

Reich, B. J., Fuentes, M., Dunson, D. B. (2011). Bayesian spatial quantile regression. Journal of the American Statistical Association, 106, 6-20.

Sankaran, P. G., Midhu, N. N. (2017). Nonparametric estimation of mean residual quantile function under right censoring. Journal of Applied Statistics, 44(10), 1856-1874.

Tian, Y., Li, E., Tian, M. (2016). Bayesian joint quantile regression for mixed effects models with censoring and errors in covariates. Computational Statistics, 31(3), 1-27.

Tian, Y., Tian, M., Zhu, Q. (2014). Linear quantile regression based on EM algorithm. Communications in Statistics – Theory and Methods, 43(16), 3464-3484.

Tibshirani, R. (1996). Regression shrinkage and selection via the Lasso. Journal of the Royal Statistical Society: Series B, 58(1), 267-288.

Xu, Z., Zhang, H., Wang, Y., Chang, X., Liang, Y. (2010). L1/2 regularization. Science China Information Sciences, 53(6), 1159-1169.

Yu, K., Moyeed, R. A. (2001). Bayesian quantile regression. Statistics and Probability Letters, 54, 437-447.

Zhao, K. F., Lian, H. (2015). Bayesian Tobit quantile regression with single-index models. Journal of Statistical Computation and Simulation, 85(6), 1247-1263.

Zhao, W. H., Lian, H., Song, X. Y. (2017). Composite quantile regression for correlated data. Computational Statistics and Data Analysis, 109, 15-33.

Zou, H. (2006). The adaptive LASSO and its oracle properties. Journal of the American Statistical Association, 101, 1418-1429.

Zou, H., Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B, 67(2), 301-320.

Zou, H., Yuan, M. (2008). Composite quantile regression and the oracle model selection theory. The Annals of Statistics, 36, 1108-1126.