Journal Pre-proof Bayesian bridge-randomized penalized quantile regression Yuzhu Tian, Xinyuan Song
PII: DOI: Reference:
S0167-9473(19)30231-2 https://doi.org/10.1016/j.csda.2019.106876 COMSTA 106876
To appear in:
Computational Statistics and Data Analysis
Received date : 1 March 2019 Revised date : 10 October 2019 Accepted date : 12 October 2019 Please cite this article as: Y. Tian and X. Song, Bayesian bridge-randomized penalized quantile regression. Computational Statistics and Data Analysis (2019), doi: https://doi.org/10.1016/j.csda.2019.106876. This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
© 2019 Elsevier B.V. All rights reserved.
*Manuscript Click here to view linked References
Journal Pre-proof
Bayesian Bridge-Randomized Penalized Quantile Regression Yuzhu Tiana , Xinyuan Songb,∗ a School
of Mathematics and Statistics, Henan University of Science and Technology, LuoYang; of Statistics, The Chinese University of Hong Kong, HongKong.
of
b Department
Abstract
p ro
Quantile regression (QR) is an ideal alternative for depicting the conditional quantile functions of a response variable when the conditions of linear regression are unavailable. One advantage of QR in relation to the traditional mean regression is that the QR estimates are more robust against outliers and a large class of error distributions. Regularization methods have been verified to be effective in QR literature for simultaneously conducting parameter estimation and variable selection. This study considers a bridge-
Pr e-
randomized penalty of regression coefficients by incorporating uncertainty penalty into Bayesian bridge QR. The asymmetric Laplace distribution (ALD) and the generalized Gaussian distribution (GGD) priors are imposed on model errors and regression coefficients, respectively, to establish a Bayesian bridge-randomized QR model. In addition, bridge penalty exponent is deemed as a parameter, and a Beta-distributed prior is forced on. By utilizing the normal-exponential and uniform-Gamma mixture representations of the ALD and the GGD, a Bayesian hierarchical model is constructed to conduct the fully Bayesian posterior inference. Gibbs sampler and Metropolis–Hastings algorithms are utilized to draw Markov chain Monte Carlo samples from the full conditional posterior distributions of all unknown parameters. Finally, the proposed procedures are illustrated by simulation studies and applied to a real-data analysis.
1
urn
Regularization method
al
Keywords: Bridge-randomized penalty, Hierarchical model, MCMC methods, Quantile regression,
1. Introduction
As a simple type of regression model, linear regression models have been used extensively in
3
statistical analysis and many applied fields, including but not limited to finance, economics, envi-
4
ronmental science, and society. Regression models are commonly estimated using the traditional
5
least square estimation (LSE) method, which depicts the mean function of the response variable.
6
However, for many response observations with non-normal errors or outliers, the LSE may result in
Jo
2
∗ The corresponding author is Xinyuan Song. Department of Statistics, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong. Phone number: (852)39437929. Fax number: (852)26035188. E-mail address:
[email protected].
Preprint submitted to Computational Statistics & Data Analysis
October 10, 2019
Journal Pre-proof
non-robust parameter estimation. The results are not unexpected to be sensitive to outliers and/or
8
heavier-tailed errors. The estimation efficiency may be naturally reduced in this case. To solve
9
this problem, quantile regression (QR; Koenker and Bassett, 1978) is used as an alternative. With
10
the rapid development in the recent decades, QR has been an attractive statistical tool in modern
11
regression analysis. QR depicts the effects of covariates on the complete conditional distributions
12
of a response variable instead of only the average value (see Koenker, 2005; Davino et al., 2014;
13
Koenker et al., 2017 for recent reviews and further comprehensive studies on QR).
of
7
QR is traditionally solved using approaches, such as interior algorithm (Koenker and Park,
15
1996), Majorize-Minimization (MM) algorithm (Hunter and Lange, 2000), Bayesian approach (Yu
16
and Moyeed, 2001), and the Expectation-Maximization (EM) algorithm (Tian et al., 2014). Among
17
them, the Bayesian QR approach has more advantages compared with commonly used frequentist
18
approaches and has been used extensively in many complex statistical models. Yu and Moyeed
19
(2001) proposed to use the asymmetric Laplace distribution (ALD) as a working likelihood and
20
developed the Bayesian QR method to conduct the posterior inference. The ALD likelihood can
21
lead to simple computation and is easy to incorporate into the Bayesian framework for different
22
types of data. As a result, Bayesian approaches provide a convenient alternative inference tool
23
for QR. Many developments on Bayesian QR analysis are available. For example, Geraci and
24
Bottai (2007) studied QR for longitudinal data; Reich et al. (2011) considered Bayesian spatial
25
QR models; Kobayashi and Kozumi (2013) considered Bayesian QR analysis of censored dynamic
26
panel data; Zhao and Lian (2015) considered Bayesian Tobit QR for single-index models; Huang
27
(2016) studied Bayesian semiparametric quantile mixed effects models with complex data; Tian et
28
al. (2016) studied Bayesian joint QR for mixed effects models with multiple data feature.
al
Pr e-
p ro
14
In regression analysis, many covariates may be brought into models. A parsimonious model that
30
only retains significant covariates is highly desirable. Various variable selection procedures have
31
been proposed to solve this problem by using some information criteria methods, such as Akaike
32
information criterion (AIC), Bayesian information criterion (BIC), and other recent penalized
33
regularization methods. Many regularization methods, including least absolute shrinkage and select
34
operator (LASSO; Tibshirani, 1996), adaptive LASSO (Zou, 2006), smoothly clipped absolute
35
deviation (SCAD; Fan and Li, 2001), elastic-net (Enet; Zou and Hastie, 2005), bridge penalized
36
regression (Fu, 1998; Knight and Fu, 2000; Huang et al., 2008), and L1/2 -norm penalization (Xu et
37
al., 2010), can be used to conduct variable selection and model estimation simultaneously. These
38
approaches impose a penalty on the size of the regression coefficients, which makes it possible to
39
estimate parameters in the presence of a large number of variables and a relatively small number
40
of observations.
Jo
urn
29
2
Journal Pre-proof
With the rapid development of Bayesian statistical computation, many penalized regularization
42
methods can be conveniently conducted in a Bayesian framework. For example, Park and Casella
43
(2008) proposed Bayesian LASSO (BLASSO) regression; Alhamzawi et al. (2012) considered
44
Bayesian adaptive LASSO (BALASSO) QR; Gefan (2014) discussed Bayesian doubly adaptive Enet
45
LASSO for VAR shrinkage; Polson et al. (2014) studied Bayesian bridge regression; Betancourt et
46
al. (2017) studied Bayesian fused Lasso regression for dynamic binary networks; Alhamzawi and
47
Ali (2018) discussed Bayesian tobit QR with L1/2 penalty; Alhamzawi and Algamal (2018) studied
48
Bayesian bridge QR with a fixed penalty index ξ = 1/2; and Mallick and Yi (2018) considered
49
Bayesian bridge regression using L1/2 -norm regularization. However, different penalized methods
50
present different statistical properties and variable selection efficiency. Xu et al. (2010) showed
51
that Lξ (ξ ∈ (0, 1]) regularizer possess many desirable statistical properties, such as oracle, sparsity,
52
and unbiasedness.
p ro
of
41
In the present work, we are interested in Bayesian bridge-randomized QR based on Lξ (ξ ∈ (0, 1])
54
regularizer. For bridge QR regression with Lξ -norm regularization, penalty exponent ξ is deemed
55
as a parameter that can be estimated in Bayesian framework. For example, a Beta family prior
56
is forced on this penalty exponent ξ, and the Bayesian hierarchical QR model is established to
57
conduct statistical inference.
Pr e-
53
The remainder of this paper is organized as follows. Section 2 presents the considered model and
59
working likelihood. In Section 3, we establish a Bayesian hierarchical model of bridge-randomized
60
penalized QR. In Section 4, we consider its adaptive version. Section 5 presents some numerical
61
studies to illustrate the proposed methods. Section 6 applies the proposed methodology to a
62
real-life study. Finally, Section 7 concludes the paper.
63
2. Model and working likelihood
urn
64
al
58
We consider the following linear regression model: yt = xTt β + εt , t = 1, · · · , n,
(1)
65
where xt = (xt1 , · · · , xtp )T is a p-dimensional regression covariates, β = (β1 , · · · , βp )T is the
66
regression parameter vector, and εt is the error term. For any quantile level τ ∈ (0, 1), the τ th conditional quantile of the response yt can be defined as
Jo
67
68
Quantileτ (yt |xt ) = xTt β. According to Koenker and Bassett (1978), QR estimator can be obtained
69
by minimizing the following loss objective function: Q0 (β) =
n ∑ t=1
ρτ (yt − xTt β), 3
(2)
Journal Pre-proof
70
where ρτ (u) = u(τ − I(u < 0)) is the quantile check function.
71
In QR modeling, to select significant covariates for improving prediction accuracy, we can select
72
various penalty functions p(·), such as L1 penalty (LASSO and adaptive LASSO), L2 penalty
73
(ridge regression), Lξ (0 < ξ < 1) penalty (bridge regression), Enet, and SCAD, which result in
74
the penalized QR loss objective function as follows:
75
(3)
of
Q(β) = Q0 (β) + pλ (|β|), where λ > 0 is a penalty parameter, and pλ (|β|) is a penalty function on β.
Among the above penalized regression, the bridge penalized regression is an important regu-
77
larization method that utilizes the Lξ -norm penalty. Generally, the regularization regression with
78
Lξ -norm penalty of QR regression models can be expressed as follows: min β
n ∑ t=1
ρτ (yt −
xTt β)
p ro
76
+λ
p ∑ i=1
|βi |ξ ,
(4)
where λ > 0 is the tuning parameter, and ξ is the penalty exponent of the Lξ -norm penalization
80
function.
81
Pr e-
79
For bridge penalized regression (4), penalization function pλ (|β|) = λ
∑p
i=1
|βi |ξ (ξ ≥ 0) includes
many popular penalties as follows: the best subset selection if ξ = 0, LASSO regression if ξ = 1,
83
ridge regression if ξ = 2, bridge regression if 0 < ξ < 1, and L1/2 regularizer if ξ = 1/2. Among
84
the Lξ -norm regularizers, bridge regularizer has many desirable statistical properties. Huang et
85
al. (2008) studied the asymptotic properties of bridge regression. Xu et al. (2010) showed that
86
bridge regression possesses oracle, sparsity, and unbiasedness. The author also revealed that L1/2
87
penalty is the most sparse and robust among the Lξ (1/2 ≤ ξ ≤ 1) and has similar properties
88
to the Lξ (0 < ξ < 1/2) regularizers. In a Bayesian framework, Polson et al. (2014) recently
89
proposed a Bayesian bridge estimator for regularized regression; Alhamzawi and Algamal (2018)
90
studied Bayesian bridge QR with fixed penalty index ξ = 1/2; and Mallick and Yi (2018) considered
91
Bayesian bridge regression and provided sufficient conditions for strong posterior consistency under
92
a sparsity assumption on a high-dimensional parameter.
urn
al
82
In the aforementioned studies, ξ is generally regarded as a fixed constant, and the Lξ regularizers
94
for 0 < ξ ≤ 1 outperform the Lξ regularizers for 1 < ξ ≤ 2 in simulations. That is, Lξ regularization
95
with a smaller value of ξ results in better solution compared with that with a larger value of ξ.
96
In the literature, the best value for penalty index ξ in Lξ -norm regularization functions is located
97
in the interval (0, 1]. Naturally, ξ can be considered as an unknown parameter and be selected
98
using a data-driven method to obtain the optimal penalty. However, no study has ever focused on
99
selection of the optimal ξ for bridge penalty regression. In a frequentist framework, selecting an
Jo
93
4
Journal Pre-proof
optimal ξ by minimize the complex objective loss function is difficult. Bayesian methods provide
101
an available alternative. In a Bayesian framework, ξ can be treated as a random variable and
102
can be estimated using MCMC algorithm. In this study, we aim to conduct Bayesian statistical
103
inference based on uncertain penalty exponent ξ. We rename the Bayesian bridge QR regression
104
with unknown ξ as the Bayesian bridge-randomized (BRBridge) QR regression by forcing bridge-
105
randomized regularization priors on regression parameters. We consider bridge QR estimators (4)
106
and its adaptive version as follows: min
n ∑ t=1
ρτ (yt − xTt β) +
p ∑ i=1
λi |βi |ξ ,
of
100
(5)
where ξ is a penalty exponent valued in the interval (0, 1], which represents concave bridge penalty.
108
In a Bayesian QR framework, we need to specify a working likelihood for the model error.
109
According to Yu and Moyeed (2001), maximizing the likelihood under ALD errors is equivalent to
110
minimizing the objective loss function (2) of QR. ALD has its probability density function (pdf)
111
as follows:
p ro
107
{ τ (1 − τ ) y−µ } exp − ρτ ( ) , σ σ where µ is the location, σ is the scale, and 0 < τ < 1 is the skewness.
112
113
Pr e-
f (y|µ, σ, τ ) =
The conditional likelihood of QR model (1) can be equivalently defined as n { [ y − xT β ]} ∏ τ (1 − τ ) t t H(β, σ|(xT1 , y1 ), · · · , (xTn , yn )) = exp − ρτ . σ σ t=1
(6)
(7)
114
For the conditional likelihood function (7), we suffer computation difficulty due to the inherent
115
non-differentiability of QR check function. Nevertheless, Kozumi and Kobayashi (2011) provided
where θ1 =
1−2τ τ (1−τ ) , θ2
=
al
117
a scale mixture of Gaussian representation of model (1), as shown as follows: √ yt = xTt β + θ1 υt + θ2 συt · et , t = 1, 2, · · · , n, 2 τ (1−τ ) , υt
(8)
∼ Exp( σ1 ), et ∼ N (0, 1), and υt and et are independent of
urn
116
118
one another. Hence, the resulting conditional distribution of yt is normally distributed with mean
119
xTt β + θ1 υt and variance θ2 συt .
120
121
On the basis of the mixture representation (8), the joint hierarchical likelihood of the complete data {y, x, υ} is given by
n [ ∏
Jo
L(β, σ|y, x, v) =
t=1
{ (y − xT β − θ υ )2 } 1 { 1 }] 1 t 1 t t √ √ exp − exp − υt . 2θ2 συt σ σ 2π θ2 συt
(9)
122
We need to specify bridge penalization prior for regression coefficient to conduct the fully
123
Bayesian analysis of bridge regularized models (4) and (5). The implementation of BRBridge and
124
its adaptive version (BARBridge) will be presented in the following section. 5
Journal Pre-proof
125
3. BRBridge and its adaptive version
126
3.1. Priors
Nadarajah, 2006) priors on the regression coefficients, as shown as follows: π(β|σ, λ, ξ) =
p ∏
i=1 129
130
131
π(βi |σ, λ, ξ), π(βi |σ, λ, ξ) =
{ λ } ( σλ )1/ξ exp − |βi |ξ , 2Γ(1 + 1/ξ) σ
where λ > 0, and ξ ∈ (0, 1].
(10)
of
128
To analyze BRBridge QR model (4), we impose generalized Gaussian distribution (GGD,
By combining GGD priors into the conditional likelihood function (7), we obtain a marginal posterior density of β as follows: π(β|y, x) ∝ L(β, σ|y, x, v)π(β|σ, λ, ξ) n ∏
t=1
exp
{
{ λ∑ } (yt − xTt β − θ1 υt )2 } exp − |βi |ξ . 2θ2 συt σ i=1
−
p
(11)
Pr e-
∝
p ro
127
We observe that the estimator of β obtained in (4) amounts to the posterior mode of marginal
133
density (11). However, the posterior of (11) is analytically intractable because it is difficult to
134
calculate due to bridge penalty. Polson et al. (2014) presented a mixture representation of the
135
GGD with a two-component mixture of Gamma distributions. Mallick and Yi (2018) studied
136
Bayesian bridge regression and proposed a new mixture representation of the GGD, in which the
137
mixing distribution is a particular Gamma distribution. This mixture representation is simpler
138
and more efficient than that of Polson et al. (2014). The mixture representation of the GGD
139
proposed by Mallick and Yi (2018) has the following form:
al
132
141
142
∫
∞
|x|ξ
{ } 1 λ1+1/ξ u1/ξ exp − λu du. 2u1/ξ Γ(1 + 1/ξ)
By using the above mixture representation, we decompose the prior of βi in (10) as ∫ ∞ π(βi |σ, λ, ξ) = π(βi |λ, σ, si , ξ)π(si |ξ)dsi ,
(13)
) ( 1/ξ 1/ξ where π(βi |λ, σ, si , ξ) = Uniform[−si , si ], and π(si |ξ) = Gamma 1 + 1ξ , σλ , i = 1, · · · , p. The priors of other parameters are set as follows:
π(ξ) ∼ Beta(e, f ), π(λ) ∼ Gamma(a, b), π(σ) ∼ IGamma(c, d),
143
(12)
0
Jo
140
urn
{ } λ1/ξ exp − λ|x|ξ = 2Γ(1 + 1/ξ)
where IGamma(·, ·) denotes the inverse Gamma distribution.
6
(14)
Journal Pre-proof
144
145
146
3.2. Posterior inference On the basis of the hierarchical likelihood (9) and the preceding priors, we formulate the hierarchical model as follows: y n×1 |X, β, σ ∼ L(β, σ|y, x, v), β|S, λ, σ, ξ ∼
i=1
1/ξ
1/ξ
Uniform[−si , si ], S = {si , i = 1, · · · , p},
( 1 λ) Gamma 1 + , , ξ σ i=1 p ∏
of
S|λ, σ, ξ ∼
p ∏
p ro
λ ∼ Gamma(a, b),
(15)
σ ∼ IGamma(c, d), ξ ∼ Beta(e, f ).
Let θ = {β, S, λ, σ, ξ} be a vector of unknown parameter. The Bayesian estimate of θ can be obtained through the mean or mode of the posterior samples drawn from p(θ|y, x, v). However,
Pr e-
p(θ|y, x, v) is indirectly tractable because v is also random. Thus, we use the data augmentation and Gibbs sampler to iteratively sample from p(θ|y, x, v) and p(v|y, x, θ). Given that θ includes multiple components, p(θ|y, x, v) is still complicated. The Gibbs sampler is used further to simulate each unknown of θ from its full conditional distribution. Based on the hierarchical model and prior specifications, the full conditional distributions can be obtained as follows: π(σ|y, x, υ, β, λ, ξ) ∼ IGamma(δ, η),
al
p ∑ ( ) π(λ|y, x, υ, β, σ, ξ) ∼ Gamma p(1 + 1/ξ) + a, b + si /σ .
urn
( 1 η 2 θ2 + 2θ ) 2 π(υt |y, x, β, σ, λ, ξ) ∼ GIG , t , 1 , 2 θ2 σ θ2 σ p ∏ 1/ξ ∗ ∗ π(β|y, x, υ, σ, λ) ∼ N (β , B ) I(|βi | ≤ si ), π(S|y, X, υ, β, σ, λ) ∼
p ∏
i=1
149
150
3n 2
Exp(λ/σ)I{si ≥ |βi |ξ },
+ p(1 + 1/ξ) + c, η =
Jo
148
where δ = ηt = y t −
xTt β,
(16)
i=1
π(ξ|β, σ, λ, y, x) ∝ ξ e−1 (1 − ξ)f −1 147
i=1
∑n
t=1
(
p p ∏ (λ/σ) ξ I{0 ≤ ξ ≤ 1} I{|βi |ξ ≤ si }, [Γ(1 + 1/ξ)]p i=1
e2t 2θ2 υt
) ∑p + υt + λ i=1 si + d, et = yt − xTt β − θ1 υt ;
GIG(λ, χ, ψ) denotes the generalized inverse Gaussian distribution with index λ (∑ ) (∑ )−1 xt xT n n xt y˜t ∗ t and scale parameters χ > 0 and ψ > 0; β ∗ = B ∗ , B = , and t=1 θ2 συt t=1 θ2 συt y˜t = yt − θ1 υt .
7
Journal Pre-proof
151
In the full conditional distributions of (16), π(σ|·), π(λ|·), and π(vt |·) are familiar distributions,
152
and sampling from them is straightforward and fast. On the contrary, π(β|·) is a multivariate
153
truncated normal distribution. Sampling from π(β|·) can be implemented by iteratively sampling
154
from multiple conditional univariate truncated normal distributions (Devroye, 1986). π(si |·) is a
155
left-truncated exponential distribution, sampling from which is accomplished by using the inverse
156
transformation method with two substeps: (a) generate s∗i ∼ Exp(λ/σ) and (b) let si = s∗i +
|βi |ξ , i = 1, · · · , p. π(ξ|·) is a nonstandard and complex distribution. A random-walk Metropolis
158
algorithm (Polson et al., 2014) is employed to sample from π(ξ|·).
159
3.3. Adaptive version of BRBridge QR
p ro
of
157
160
In this section, we consider BARBridge QR, an adaptive version of BRBridge QR. Unlike
161
BRBridge QR that imposes a single penalty parameter on all regression coefficients, BARBridge
162
introduces a different penalty to each regression coefficient. Specifically, the prior distributions of
163
the regression coefficients are defined as follows: p ∏
i=1
{ λ } ( λσi )1/ξ i exp − |βi |ξ , 2Γ(1 + 1/ξ) σ
Pr e-
π(β|σ, Λ, ξ) =
π(βi |σ, λi , ξ), π(βi |σ, λi , ξ) =
(17)
164
where λi > 0 is the penalty parameter imposed on βi , Λ = (λ1 , · · · , λp ), and ξ ∈ (0, 1]. Further-
165
more, the prior distributions of S and Λ are assigned as follows: S|Λ, σ, ξ ∼
( 1 λi ) Gamma 1 + , , ξ σ i=1 p ∏
Λ∼
p ∏
Gamma(ai , bi ),
(18)
i=1
where S, σ, and ξ are the same as those in (15), and ai and bi are hyperparameters. The full
167
conditional distributions can be derived in a similar manner as in Section 3.2. The implementation
168
of the Gibbs sampler and MCMC algorithm is likewise similar. The details are omitted.
al
166
Compared with BRBridge QR that equally penalizes all regression coefficients, BARBridge
170
allows data-driving penalties, which automatically introduce high penalties to unimportant coeffi-
171
cients and low penalties to important ones, thereby enabling highly flexible and efficient variable
172
selection and parameter estimation, especially in sparse cases.
173
4. Numerical studies
174
4.1. Simulation 1
Jo
urn
169
175
In this section, we conduct simulation studies to assess the finite sample performance of the
176
proposed Bayesian estimation procedures. We generate 100 datasets from Equation (1) in the
177
following cases with sample size of n = 100 and 200 as follows: 8
Journal Pre-proof
10000
15000
20000
0
0
5000
10000
15000
20000
10000
0.0 1.0 2.0
0.0 1.0 2.0
5000
15000
20000
1.0 0.0
1.0
5000
10000
20000
5000
10000
15000
20000
5000
10000
15000
20000
15000
20000
xi
0.0 0
15000
beta_6
0
sigma
10000
beta_4
0
beta_5
0
5000
0.0 1.0 2.0
0.0 1.0 2.0
beta_3
of
5000
p ro
0
beta_2
0.0 1.0 2.0
0.0 1.0 2.0
beta_1
15000
20000
0
5000
10000
Pr e-
Figure 1: MCMC chains starting from different initial values under Model 1: n = 100, εt ∼ t3 , and τ = 0.5.
Model 1 (Dense case): β = (0.85, 0.85, 0.85, 0.85, 0.85, 0.85)T ;
179
Model 2 (Sparse case): β = (0.85, 0, 0.85, 0, 0, 0)T .
180
Predictors xt = (xt1 , · · · , xtp )T , t = 1, · · · , n, where xt1 , · · · , xtp are generated independently
181
from a standard normal distribution. Meanwhile, standard normal distribution (N (0, 1)) and
182
heavy-tailed t distribution with 3 degree of freedom (t3 ) are considered for model errors. Three
183
quantile levels, 0.25, 0.50, and 0.75, are considered in all simulations. The hyperparameters of
184
the Gamma, inverse Gamma, and Beta priors discussed in Section 3 are set as follows: (Prior 1)
185
a = b = c = d = 0.1 and e = f = 1.
urn
al
178
186
To guide the MCMC convergence, we first conduct a few test runs using different initial values
187
in the setting of Model 1: n = 100, εt ∼ t3 , and τ = 0.5. We consider three different sets
188
of initial values, namely, initial 1: {β = (−3, −3, −3, −3, −3, −3)T , σ = 1, ξ = 1/3}; initial 2:
{β = (1, 1, 1, 1, 1, 1)T , σ = 5, ξ = 1/2}; and initial 3: {β = (3, 3, 3, 3, 3, 3)T , σ = 10, ξ = 2/3}.
190
Figure 1 presents the three MCMC chains starting from these initial values. The rapid mixing of
191
the MCMC chains indicates a quick convergence of the MCMC algorithm. Figures 2 and 3 present
192
the trace and density plots of 20,000 posterior samples of one MCMC chain of the regression and
193
penalty parameters, which reconfirm that the MCMC chains of all parameters rapidly converge to
194
their stationary distributions. We also depict the autocorrelation function (ACF) plot to check the
Jo
189
9
Journal Pre-proof
0
5000
0.0 1.0
5000
15000
0.0 1.0
beta_6
Trace plot
5000
15000
Density plot
Density plot
1.2
1.4
0.4
1.0
1.2
0.6
0.4
0.6
0.8
1.0
1.0
1.2
0.0 2.0
beta_6
1.2
0.8
Density plot
1.2
0.6
Pr e-
1.0
0.8
0.0 2.0
beta_5 0.8
Density plot
Density plot
0.0 2.0
0.6
0.6
15000
p ro
1.0
5000 Index
0.0 2.0
0.0 2.0
0.8
0
beta_3
Index
of
0
0 2 4
15000
0.0 1.0
Trace plot
beta_5
Trace plot
Density plot
beta_4
0
Index
Index
0.6
0.4
15000 Index
5000
0.4
beta_3
15000 Index
beta_2
0
Trace plot
0.0 1.0
beta_2 5000
0.0 1.0
beta_4
0
beta_1
Trace plot
0.0 1.0
beta_1
Trace plot
0.8
1.0
1.2
1.4
Figure 2: Trace plot of 20,000 Gibbs samples and density plots of 1000 posterior samples of regression coefficients under Model 1: n = 100, εt ∼ t3 , and τ = 0.5.
3
4
Density plot
1
2
tau=0.25
1.0 0.5
0
0.0
tau=0.25
1.5
Trace plot
0
5000
10000
15000
20000
0.4
0.6
0.8
1.0
al
Index
0
5000
10000
15000
2
3
4
Density plot
0
1
tau=0.50
urn
1.0 0.5 0.0
tau=0.50
1.5
Trace plot
20000
0.2
0.4
0.6
0.8
1.0
Index
5000
10000
4 3 2 1 0
Jo 0
Density plot
tau=0.75
1.0 0.5 0.0
tau=0.75
1.5
Trace plot
15000
20000
0.2
0.4
0.6
0.8
1.0
Index
Figure 3: Trace plot of 20,000 Gibbs samples and density plots of 1000 posterior samples of penalty index ξ under Model 1: n = 100, εt ∼ t3 , and τ = 0.25, 0.50, and 0.75.
10
Journal Pre-proof beta_2
0.0 0.8
0.0 0.8
beta_1
0
10
20
30
40
0
10
20
30
40
30
40
beta_4
0.0 0.8
0.0 0.8
beta_3
10
20
30
40
0
10
20
0.0 0.8
beta_6
0.0 0.8
beta_5
0
10
20
30
40
0
10
10
20
30
30
40
p ro
0.0 0.8 0
20
40
xi
0.0 0.8
sigma
of
0
0
10
20
30
40
Figure 4: ACF plot under Model 1: n = 100, εt ∼ t3 , and τ = 0.5. The horizontal axis denotes the number
Pr e-
of MCMC iterations.
autocorrelation between posterior samples in the abovementioned setting. Figure 4 shows that the
196
autocorrelation between posterior samples rapidly declines to zero. Hence, we use all samples after
197
burn-ins instead of taking every qth (q > 1) iterated samples in a long chain in the subsequent
198
analyses. To be conservative, for each simulation we run the Gibbs sampling algorithm 20,000
199
iterations. Then, we discard the first 5,000 iterations as burn-ins and perform Bayesian inference
200
using the remaining 15,000 posterior samples. In the simulations, the MH acceptance rate for
201
penalty exponent ξ is approximately 15% − 30% for all cases.
al
195
On the basis of 100 replications, the averaged estimation biases (Bias), root mean square errors
203
(RMSE), 95% equal-tailed confidence lower limits (CL) and upper limits (UL) of the regression
204
coefficients obtained by using BRBridge and BARBridge QR over the three quantile levels are
205
presented in Tables 1–2 (Model 1) and 3–4 (Model 2). Meanwhile, the averaged estimation values,
206
standard error estimates, and the 95% equal-tailed credible intervals of ξ are presented in the last
207
columns of these tables (written in bold). Tables 1 and 2 show that BRBridge and BARBridge
208
QR procedures can estimate parameters accurately for two error distributions under sample sizes
209
of n = 100 and n = 200. In the case of τ = 0.5, BRBridge and BARBridge QR yield superior
210
estimation results with smaller biases and RMSE for most of parameters under N (0, 1) error than
211
under heavy-tailed t3 error over the three quantiles. The proposed procedures can provide nearly
212
unbiased estimation results for all regression coefficients. As the sample size n increases, the
213
estimation results improve with smaller RMSE. For dense coefficients (Model 1), the proposed
Jo
urn
202
11
Journal Pre-proof
Table 1: BRBridge QR results of Model 1.
Error
τ
Parameter
β1 = 0.85
β2 = 0.85
β3 = 0.85
β4 = 0.85
β5 = 0.85
β6 = 0.85
ξ
n = 100 −0.028 0.146 0.529 1.091
0.010 0.126 0.616 1.096
−0.017 0.141 0.563 1.097
−0.036 0.120 0.622 1.028
−0.028 0.137 0.595 1.093
−0.016 0.140 0.565 1.102
0.802 0.004 0.795 0.811
0.50
Bias RMSE 95% CL 95% CU
−0.007 0.116 0.647 1.071
−0.029 0.117 0.613 1.033
−0.015 0.111 0.620 1.042
−0.014 0.121 0.597 1.050
−0.021 0.111 0.637 1.020
−0.002 0.104 0.637 1.027
0.803 0.005 0.794 0.811
0.75
Bias RMSE 95% CL 95% CU
0.003 0.133 0.617 1.052
0.006 0.131 0.594 1.103
0.006 0.128 0.642 1.100
−0.014 0.132 0.584 1.093
−0.015 0.110 0.581 1.023
−0.046 0.147 0.544 1.035
0.803 0.005 0.794 0.811
0.25
Bias RMSE 95% CL 95% CU
−0.017 0.162 0.515 1.114
−0.035 0.158 0.485 1.139
−0.036 0.174 0.506 1.126
−0.032 0.155 0.548 1.090
−0.005 0.175 0.517 1.153
−0.030 0.158 0.528 1.074
0.802 0.006 0.789 0.812
0.50
Bias RMSE 95% CL 95% CU
−0.027 0.120 0.614 1.048
−0.028 0.129 0.584 1.039
−0.037 0.139 0.538 1.084
−0.017 0.122 0.637 1.076
−0.033 0.129 0.604 1.069
−0.031 0.134 0.544 1.069
0.803 0.005 0.791 0.812
0.75
Bias RMSE 95% CL 95% CU
0.004 0.157 0.563 1.126
−0.019 0.190 0.472 1.149
−0.038 0.171 0.506 1.103
−0.014 0.153 0.521 1.150
−0.054 0.165 0.516 1.117
0.027 0.149 0.551 1.056
0.802 0.006 0.789 0.811
of
Bias RMSE 95% CL 95% CU
p ro
t3
0.25
0.25
Bias RMSE 95% CL 95% CU
−0.013 0.086 0.699 1.029
Pr e-
N (0, 1)
0.006 0.098 0.664 1.055
−0.014 0.095 0.649 1.031
−0.010 0.089 0.664 1.027
−0.016 0.097 0.672 0.991
−0.011 0.094 0.674 1.016
0.804 0.004 0.796 0.811
0.50
Bias RMSE 95% CL 95% CU
−0.005 0.079 0.732 1.025
−0.028 0.088 0.640 0.969
−0.004 0.086 0.697 1.009
−0.022 0.079 0.674 0.963
−0.015 0.091 0.665 1.002
0.000 0.083 0.688 1.009
0.805 0.005 0.795 0.814
0.75
Bias RMSE 95% CL 95% CU
0.010 0.094 0.700 1.038
−0.018 0.094 0.676 1.008
−0.005 0.091 0.661 1.023
−0.008 0.084 0.667 0.997
0.002 0.081 0.703 1.003
0.000 0.101 0.664 1.042
0.805 0.005 0.795 0.814
Bias RMSE 95% CL 95% CU
−0.017 0.124 0.585 1.041
−0.031 0.124 0.585 1.043
−0.008 0.109 0.645 1.071
−0.008 0.115 0.626 1.056
−0.018 0.117 0.630 1.051
−0.025 0.103 0.618 0.979
0.805 0.005 0.795 0.813
Bias RMSE 95% CL 95% CU
−0.015 0.089 0.678 0.985
−0.020 0.101 0.650 1.003
−0.011 0.089 0.669 0.994
−0.018 0.099 0.663 1.055
−0.008 0.082 0.667 0.995
−0.008 0.097 0.649 1.030
0.805 0.005 0.797 0.813
Bias RMSE 95% CL 95% CU
−0.014 0.110 0.659 1.055
−0.015 0.116 0.614 1.030
−0.025 0.111 0.598 1.008
−0.011 0.113 0.619 1.040
0.011 0.126 0.588 1.095
−0.020 0.111 0.625 1.033
0.804 0.005 0.795 0.813
t3
0.25
0.50
Jo
0.75
urn
N (0, 1)
al
n = 200
Note: The last column presents the averaged Bayesian estimates, standard error estimates, and the 95% equal-tailed credible intervals of ξ on the basis of 100 replications.
12
Journal Pre-proof
Table 2: BARBridge QR results of Model 1
Error
τ
Parameter
β1 = 0.85
β2 = 0.85
β3 = 0.85
β4 = 0.85
β5 = 0.85
β6 = 0.85
ξ
n = 100 −0.021 0.149 0.560 1.122
−0.009 0.144 0.582 1.091
0.000 0.142 0.581 1.147
−0.004 0.142 0.567 1.120
−0.015 0.123 0.609 1.040
−0.026 0.137 0.542 1.069
0.866 0.004 0.859 0.872
0.50
Bias RMSE 95% CL 95% CU
−0.006 0.111 0.652 1.048
−0.013 0.134 0.571 1.085
−0.009 0.118 0.620 1.061
−0.019 0.130 0.624 1.087
−0.028 0.116 0.604 1.003
−0.021 0.131 0.580 1.096
0.868 0.004 0.861 0.874
0.75
Bias RMSE 95% CL 95% CU
−0.015 0.129 0.555 1.061
−0.007 0.132 0.603 1.105
−0.024 0.132 0.580 1.073
−0.026 0.130 0.556 1.020
0.004 0.121 0.636 1.080
−0.015 0.120 0.623 1.091
0.867 0.004 0.860 0.873
0.25
Bias RMSE 95% CL 95% CU
−0.020 0.174 0.486 1.154
−0.033 0.176 0.515 1.144
−0.023 0.159 0.500 1.116
−0.025 0.153 0.545 1.092
−0.025 0.167 0.499 1.113
0.012 0.189 0.472 1.183
0.868 0.004 0.861 0.875
0.50
Bias RMSE 95% CL 95% CU
−0.023 0.121 0.595 1.053
−0.036 0.150 0.552 1.153
−0.022 0.144 0.531 1.143
−0.005 0.137 0.600 1.074
−0.028 0.136 0.552 1.048
−0.007 0.120 0.606 1.071
0.870 0.003 0.865 0.877
0.75
Bias RMSE 95% CL 95% CU
−0.021 0.184 0.466 1.165
−0.020 0.187 0.470 1.165
−0.055 0.177 0.532 1.117
−0.001 0.160 0.547 1.124
−0.027 0.178 0.488 1.168
−0.071 0.187 0.435 1.097
0.868 0.004 0.860 0.876
of
Bias RMSE 95% CL 95% CU
p ro
t3
0.25
0.25
Bias RMSE 95% CL 95% CU
−0.012 0.094 0.659 0.999
Pr e-
N (0, 1)
−0.008 0.096 0.673 1.004
0.006 0.097 0.676 1.023
−0.012 0.100 0.657 1.003
−0.014 0.097 0.658 1.015
−0.011 0.082 0.678 1.010
0.866 0.003 0.861 0.873
0.50
Bias RMSE 95% CL 95% CU
0.008 0.092 0.671 1.017
−0.009 0.088 0.668 1.021
0.007 0.080 0.707 1.006
−0.003 0.086 0.668 0.993
−0.010 0.088 0.676 0.988
−0.021 0.092 0.663 0.967
0.867 0.004 0.861 0.875
0.75
Bias RMSE 95% CL 95% CU
−0.009 0.087 0.679 1.003
−0.005 0.092 0.675 1.025
−0.003 0.083 0.683 0.987
−0.015 0.092 0.675 1.055
−0.007 0.086 0.675 1.006
−0.008 0.100 0.675 1.026
0.867 0.004 0.861 0.874
Bias RMSE 95% CL 95% CU
−0.013 0.118 0.620 1.058
0.004 0.104 0.635 1.078
−0.003 0.113 0.636 1.031
−0.021 0.119 0.620 1.094
−0.011 0.110 0.613 1.072
−0.030 0.119 0.596 1.020
0.869 0.003 0.863 0.874
Bias RMSE 95% CL 95% CU
−0.015 0.086 0.665 0.974
−0.002 0.089 0.673 1.004
−0.005 0.085 0.700 0.987
−0.016 0.091 0.659 0.997
−0.012 0.098 0.680 1.028
−0.006 0.089 0.677 1.030
0.870 0.004 0.863 0.877
Bias RMSE 95% CL 95% CU
−0.003 0.126 0.616 1.098
−0.001 0.121 0.649 1.121
−0.011 0.108 0.624 1.025
0.001 0.108 0.622 1.033
0.000 0.122 0.594 1.069
−0.002 0.121 0.636 1.078
0.868 0.003 0.863 0.876
t3
0.25
0.50
Jo
0.75
urn
N (0, 1)
al
n = 200
Note: The last column presents the averaged Bayesian estimates, standard error estimates, and the 95% equal-tailed credible intervals of ξ on the basis of 100 replications.
13
Journal Pre-proof
Table 3: BRBridge QR results of Model 2 Error
τ
Parameter
β1 = 0.85
β2 = 0
β3 = 0.85
β4 = 0
β5 = 0
β6 = 0
ξ
n = 100 −0.035 0.144 0.531 1.057
0.012 0.088 −0.142 0.185
−0.021 0.141 0.542 1.080
0.005 0.097 −0.201 0.210
−0.008 0.101 −0.218 0.182
0.012 0.103 −0.202 0.198
0.616 0.060 0.503 0.705
0.50
Bias RMSE 95% CL 95% CU
−0.029 0.124 0.581 1.086
0.001 0.086 −0.174 0.173
−0.034 0.129 0.544 1.012
−0.003 0.072 −0.146 0.130
0.005 0.084 −0.159 0.203
0.004 0.090 −0.186 0.181
0.589 0.086 0.407 0.690
0.75
Bias RMSE 95% CL 95% CU
−0.033 0.137 0.562 1.070
−0.009 0.101 −0.236 0.237
−0.044 0.123 0.570 1.023
−0.004 0.098 −0.202 0.210
−0.012 0.077 −0.176 0.121
−0.006 0.088 −0.179 0.195
0.602 0.066 0.473 0.705
0.25
Bias RMSE 95% CL 95% CU
−0.056 0.175 0.416 1.103
0.003 0.100 −0.188 0.213
−0.036 0.167 0.474 1.122
0.000 0.104 −0.225 0.186
0.017 0.115 −0.162 0.247
−0.034 0.133 −0.323 0.187
0.640 0.064 0.505 0.728
0.50
Bias RMSE 95% CL 95% CU
−0.045 0.143 0.561 1.114
−0.007 0.083 −0.178 0.138
−0.070 0.154 0.485 1.046
−0.008 0.097 −0.225 0.154
0.013 0.098 −0.170 0.247
−0.008 0.097 −0.252 0.173
0.627 0.047 0.516 0.701
0.75
Bias RMSE 95% CL 95% CU
−0.040 0.173 0.449 1.134
0.006 0.133 −0.255 0.290
−0.024 0.154 0.536 1.142
0.001 0.101 −0.206 0.160
0.003 0.111 −0.269 0.243
−0.005 0.102 −0.198 0.207
0.634 0.063 0.498 0.725
of
Bias RMSE 95% CL 95% CU
p ro
t3
0.25
Pr e-
N (0, 1)
n = 200
−0.021 0.095 0.677 1.001
−0.008 0.070 −0.172 0.126
−0.029 0.100 0.642 0.991
0.008 0.076 −0.145 0.174
0.006 0.073 −0.121 0.171
0.018 0.060 −0.073 0.146
0.553 0.091 0.331 0.663
0.50
Bias RMSE 95% CL 95% CU
0.002 0.090 0.661 1.015
−0.001 0.063 −0.126 0.121
−0.015 0.081 0.706 1.010
0.001 0.059 −0.129 0.103
−0.004 0.068 −0.120 0.124
0.006 0.067 −0.130 0.144
0.558 0.068 0.388 0.668
0.75
Bias RMSE 95% CL 95% CU
−0.021 0.084 0.673 0.974
0.006 0.068 −0.137 0.138
−0.008 0.085 0.702 1.000
0.001 0.076 −0.163 0.154
0.011 0.082 −0.141 0.180
−0.001 0.071 0.145 0.180
0.551 0.095 0.362 0.679
al
Bias RMSE 95% CL 95% CU
0.25
Bias RMSE 95% CL 95% CU
−0.010 0.119 0.616 1.100
−0.012 0.078 −0.170 0.117
−0.011 0.118 0.620 1.037
0.002 0.088 −0.180 0.148
−0.003 0.089 −0.171 0.190
0.005 0.092 −0.172 0.191
0.603 0.057 0.497 0.704
0.50
Bias RMSE 95% CL 95% CU
−0.020 0.092 0.680 1.019
0.004 0.057 −0.119 0.105
−0.006 0.089 0.688 1.033
0.006 0.056 −0.111 0.119
−0.003 0.068 −0.133 0.134
−0.015 0.074 −0.185 0.108
0.567 0.074 0.392 0.672
0.75
Bias RMSE 95% CL 95% CU
−0.045 0.126 0.619 1.025
−0.003 0.079 −0.154 0.156
−0.018 0.137 0.602 1.103
−0.004 0.088 −0.176 0.211
−0.007 0.086 −0.203 0.140
0.006 0.082 −0.160 0.154
0.600 0.070 0.483 0.697
Jo
t3
0.25
urn
N (0, 1)
Note: The last column presents the averaged Bayesian estimates, standard error estimates, and the 95% equal-tailed credible intervals of ξ on the basis of 100 replications.
14
Journal Pre-proof
Table 4: BARBridge QR results of Model 2 Error
τ
Parameter
β1 = 0.85
β2 = 0
β3 = 0.85
β4 = 0
β5 = 0
β6 = 0
ξ
n = 100 −0.012 0.145 0.583 1.132
−0.003 0.084 −0.208 0.145
−0.012 0.126 0.596 1.043
−0.006 0.078 −0.162 0.170
0.50
Bias RMSE 95% CL 95% CU
−0.034 0.124 0.589 1.064
0.001 0.083 −0.136 0.168
−0.022 0.120 0.625 1.058
−0.001 0.070 −0.151 0.147
0.75
Bias RMSE 95% CL 95% CU
0.011 0.134 0.595 1.096
0.004 0.083 −0.154 0.161
−0.033 0.153 0.516 1.095
0.005 0.076 −0.171 0.172
0.25
Bias RMSE 95% CL 95% CU
−0.032 0.174 0.502 1.167
−0.017 0.098 −0.233 0.138
−0.028 0.156 0.531 1.144
0.50
Bias RMSE 95% CL 95% CU
0.001 0.084 −0.200 0.156
0.75
Bias RMSE 95% CL 95% CU
−0.006 0.089 −0.193 0.154
−0.022 0.153 0.532 1.089 −0.037 0.178 0.477 1.165
0.000 0.081 −0.205 0.176
−0.009 0.101 −0.219 0.191
0.786 0.034 0.729 0.828
0.003 0.077 −0.143 0.145
−0.010 0.081 −0.161 0.166
0.787 0.022 0.748 0.821
0.009 0.090 −0.183 0.201
−0.001 0.089 −0.198 0.191
0.785 0.034 0.729 0.834
−0.005 0.107 −0.259 0.183
−0.001 0.096 −0.216 0.224
0.016 0.090 −0.164 0.240
0.796 0.020 0.759 0.832
−0.011 0.131 0.570 1.076
−0.002 0.087 −0.184 0.182
0.010 0.073 −0.132 0.155
−0.011 0.083 −0.206 0.162
0.797 0.019 0.750 0.832
−0.037 0.178 0.483 1.090
0.013 0.102 −0.189 0.263
0.018 0.106 −0.172 0.262
0.000 0.102 −0.210 0.190
0.800 0.022 0.752 0.840
of
Bias RMSE 95% CL 95% CU
p ro
t3
0.25
Pr e-
N (0, 1)
n = 200
−0.011 0.094 0.656 1.006
0.006 0.051 −0.109 0.121
0.013 0.098 0.684 1.030
−0.003 0.068 −0.139 0.141
0.009 0.056 −0.100 0.143
−0.009 0.048 −0.100 0.070
0.767 0.032 0.707 0.817
0.50
Bias RMSE 95% CL 95% CU
0.004 0.077 0.726 1.010
0.003 0.054 −0.100 0.116
−0.015 0.093 0.671 0.987
0.000 0.052 −0.100 0.097
0.001 0.048 −0.107 0.089
−0.003 0.053 −0.102 0.096
0.771 0.028 0.712 0.819
0.75
Bias RMSE 95% CL 95% CU
−0.012 0.086 0.684 1.009
−0.001 0.056 −0.123 0.137
0.003 0.085 0.696 1.020
0.000 0.059 −0.115 0.132
0.007 0.064 −0.106 0.161
0.002 0.061 −0.115 0.131
0.768 0.037 0.686 0.820
al
Bias RMSE 95% CL 95% CU
0.25
Bias RMSE 95% CL 95% CU
−0.004 0.113 0.629 1.040
−0.008 0.064 −0.131 0.119
0.006 0.103 0.679 1.060
0.004 0.079 −0.168 0.217
−0.004 0.068 −0.157 0.131
−0.004 0.066 −0.128 0.124
0.784 0.025 0.730 0.823
0.50
Bias RMSE 95% CL 95% CU
−0.008 0.099 0.648 1.009
−0.004 0.058 −0.121 0.131
−0.017 0.105 0.649 1.035
0.002 0.063 −0.096 0.146
0.000 0.068 −0.135 0.154
−0.011 0.070 −0.201 0.104
0.781 0.023 0.738 0.816
0.75
Bias RMSE 95% CL 95% CU
0.004 0.115 0.606 1.062
−0.012 0.068 −0.153 0.121
−0.003 0.112 0.655 1.075
0.006 0.068 −0.149 0.133
−0.002 0.072 −0.141 0.143
0.016 0.076 −0.118 0.210
0.784 0.023 0.741 0.818
Jo
t3
0.25
urn
N (0, 1)
Note: The last column presents the averaged Bayesian estimates, standard error estimates, and the 95% equal-tailed credible intervals of ξ on the basis of 100 replications.
15
Journal Pre-proof
214
methods tend to select a large penalty index ξ, and the estimated values of ξ range from 0.8 to
215
0.9 across different quantiles and error distributions. Based on the density plot of the posterior
216
samples of ξ in Figure 3, the posterior distribution of ξ tends to be left-skewed.
217
Model errors can be evaluated by the following criteria based on the 100 replications: n ( ) 1∑ T ˆ (|xi (β − β true )|)], MMSE = Mean (βˆ − β true )T (βˆ − β true ) . n i=1
(19)
of
MMAD = Mean[
The MMADs and MMSEs of BRBridge and BARBridge regularization methods over the three
219
quantiles under Models 1 and 2 are presented in Table 5. For comparison, the MMADs and MM-
220
SEs of six existing methods, namely, Bayesian ridge (BRidge), BLASSO, Bayesian L1/2 (BL1/2 ),
221
Bayesian adaptive ridge (BARidge), BALASSO, and Bayesian adaptive L1/2 (BAL1/2 ) QR, are
222
also provided. Table 5 shows that BRBridge and BARBridge QR are not consistently optimal
223
compared with the other six classes of regularization methods for dense coefficients (Model 1) at
224
the 0.25th quantile. However, at the median and the 0.75th quantile, BRBridge and BARBridge
225
QR show certain advantages with small MMAD or MMSE for many cases.
Pr e-
p ro
218
For sparse Model 2, we aim to examine whether BRBridge and BARBridge QR can specify zero
227
coefficients exactly. Likewise, the convergence, trace, and density plots of posterior samples (not
228
reported) indicate a rapid convergence of the MCMC algorithm. In particular, for zero coefficients
229
β2 , β4 , β5 , and β6 , the MCMC samples of BRBridge and BARBridge QR can converge to zeros
230
stationarily. Tables 3 and 4 show that BRBridge and BARBridge QR procedures can specify
231
significant covariates and zero-valued coefficients accurately for different error distributions over
232
the three quantiles. Similar to the conclusion of Model 1, the estimation results under the normal
233
error outperform those under the t3 error over the three quantiles. In addition, comparison of the
234
estimation results of BRBridge and BARBridge QR indicates that the latter can estimate zero
235
coefficients more efficiently with smaller RMSE and shorter confidence intervals of covering zeros.
236
Hence, the proposed procedures can yield unbiased and sparse estimation results for all regression
237
coefficients. For sparse coefficients, the posterior density plot of penalty exponent ξ (not reported)
238
shows that its posterior distribution tends to be left-skewed on interval (0,1) and similar across
239
quantiles. Tables 3 and 4 show that the estimated values of ξ range from 0.6 to 0.8 across different
240
quantiles and error distributions, thereby indicating that the proposed methods select smaller ξ for
241
sparse coefficients than for dense coefficients. Moreover, the optimal value of penalty exponent ξ
242
for Bayesian bridge regression ranges mostly from 0.5 to 1. This phenomenon is consistent with the
243
conclusion of Xu (2010). Our proposed BRBridge and BARBridge QR can be regarded as a balance
244
between L1/2 - and LASSO-regularized regression. Based on Table 5, BRBridge and BARBridge
245
QR perform superior to other methods due to smaller MMAD or MMSE for a majority of cases. For
Jo
urn
al
226
16
Journal Pre-proof
246
sparse coefficients, the results of BRBridge and BARBridge QR are close to the results of BL1/2
247
and BAL1/2 QR. They both outperform BLASSO and BALASSO QR, respectively. Moreover,
248
when other conditions are the same, BARBridge QR outperforms BRBridge QR due to smaller
249
MMAD or MMSE for all cases. The simulations are conducted using a laptop [Dell Intel(R) Core(TM) i7-6600U CPU] and
251
statistical software R3.5.2. The R code is freely available upon request. In the setting of Model
252
1: n = 100, εt ∼ t3 , and τ = 0.5, the computing times of BRBridge and BARBridge QR for
253
completing one replication are 1.718 and 1.850 minutes, respectively.
254
4.2. Simulation 2
p ro
of
250
255
In this section, we conduct additional simulations to assess the performance of the proposed
256
methods under different prior inputs and model settings. Unless otherwise noted, we focus on
257
n = 100, εt ∼ t3 , and τ = 0.5 in the subsequent analyses.
We first conduct a sensitivity analysis to check whether different prior inputs influence the
259
results of posterior samples. The hyperparameters in Prior 1 are disturbed as follows: (Prior 2)
260
a = 1, b = 2, c = 1, d = 10, e = 1/2, and f = 2. The results obtained under Prior 2 are similar to
261
those reported in Tables 1–4 and not reported.
Pr e-
258
262
As suggested by an anonymous reviewer, we consider additional model settings as follows:
263
Model 3 (High correlation case): p = 6, β = (0.85, 0.85, 0.85, 0.85, 0.85, 0.85)T , and xt ∼
265
266
N (0, Σ), where Σ is a correlation matrix with off-diagonal elements 0.5. Model 4 (High-dimensional case): p = 60, β1:6 = (0.85, 0.85, 0.85, 0.85, 0.85, 0.85)T , β7:60 =
al
264
(0, · · · , 0)T , and xt ∼ N (0, Σ), where Σ is an identity matrix. Model 5 (Small n large p case): p > n, e.g., p = 150 and n = 100.
268
Based on BRBridge and BARBridge QR, we collect 15,000 posterior samples after discarding
269
5,000 burn-ins to perform Bayesian inference. The results of Models 3 and 4 are reported in Table
270
6. Compared with those in Models 1 and 2, the performances of the proposed methods are equally
271
good in Model 3, unsatisfactory when n = 100 but acceptable when n = 200 in Model 4, and
272
breaking down in Model 5. Furthermore, we notice that the estimated values of ξ in Model 4
273
are significantly smaller than those in Models 1 and 2, thereby reconfirming that the proposed
274
methods select smaller ξ for sparse coefficients than for dense coefficients. Hence, routinely fixing
275
ξ to a constant is unreasonable and may reduce estimation efficiency.
Jo
urn
267
17
Journal Pre-proof
Table 5: MMAD, MMSE and their standard errors (in parentheses) based on 100 replications
τ
Sample size
n = 100
n = 100
n = 200
n = 200
Error
N (0, 1)
t3
N (0, 1)
t3
Method
MMAD
MMSE
MMAD
MMSE
MMAD
MMSE
MMAD
MMSE
Model 1
0.75
.105(.073) .091(.053) .100(.063) .109(.060)
.281(.083) .294(.080) .298(.098) .295(.087)
.144(.091) .153(.082) .163(.127) .160(.094)
.164(.048) .165(.051) .172(.057) .170(.049)
.048(.027) .048(.029) .053(.035) .052(.031)
.188(.061) .202(.059) .208(.066) .211(.060)
.064(.041) .072(.043) .077(.049) .079(.045)
BARidge BALASSO BAL1/2 BARBridge
.239(.067) .240(.077) .228(.068) .253(.074)
.104(.059) .106(.073) .095(.060) .116(.069)
.299(.085) .284(.080) .312(.085) .307(.091)
.165(.098) .145(.081) .174(.100) .172(.118)
.171(.043) .164(.049) .163(.048) .172(.051)
.050(.024) .047(.029) .045(.025) .053(.030)
.200(.062) .209(.070) .214(.063) .207(.063)
.071(.048) .080(.061) .081(.048) .077(.048)
BRidge BLASSO BL1/2 BRBridge
.214(.057) .220(.061) .222(.064) .209(.066)
.084(.046) .086(.051) .087(.049) .077(.044)
.237(.077) .244(.080) .238(.074) .231(.075)
.104(.070) .113(.082) .103(.064) .099(.064)
.155(.043) .148(.044) .163(.051) .156(.044)
.042(.022) .039(.024) .047(.029) .042(.023)
.174(.062) .175(.054) .175(.054) .169(.052)
.055(.041) .054(.034) .054(.033) .051(.033)
BARidge BALASSO BAL1/2 BARBridge
.222(.072) .215(.069) .215(.070) .222(.065)
.089(.061) .083(.055) .085(.053) .091(.054)
.243(.082) .246(.074) .261(.087) .243(.085)
.113(.080) .113(.068) .127(.096) .109(.077)
.163(.045) .153(.046) .154(.045) .161(.048)
.046(.025) .042(.025) .041(.024) .046(.028)
.174(.049) .170(.051) .181(.057) .166(.047)
.052(.028) .051(.029) .058(.036) .048(.027)
BRidge BLASSO BL1/2 BRBridge
.234(.072) .222(.074) .236(.069) .236(.066)
.102(.066) .089(.059) .103(.064) .101(.054)
BARidge BALASSO BAL1/2 BARBridge
.238(.070) .234(.067) .241(.071) .233(.069)
.104(.060) .097(.054) .105(.063) .097(.060)
of
.236(.081) .223(.064) .236(.076) .247(.071)
p ro
0.50
BRidge BLASSO BL1/2 BRBridge
Pr e-
0.25
.284(.079) .300(.088) .295(.097) .294(.095)
.149(.087) .161(.098) .157(.098) .161(.111)
.162(.047) .164(.048) .166(.048) .168(.048)
.046(.027) .047(.027) .049(.029) .049(.027)
.210(.070) .201(.060) .206(.074) .209(.067)
.080(.051) .072(.042) .077(.057) .078(.052)
.290(.092) .283(.084) .288(.090) .301(.100)
.153(.100) .146(.088) .160(.108) .140(.125)
.164(.051) .169(.052) .173(.049) .166(.047)
.048(.029) .050(.030) .052(.028) .048(.026)
.196(.061) .202(.057) .206(.069) .204(.064)
.070(.044) .073(.041) .077(.058) .072(.048)
Model 2
0.75
.094(.056) .083(.051) .070(.053) .073(.051)
BARidge BALASSO BAL1/2 BARBridge
.200(.064) .180(.066) .159(.066) .157(.071)
.074(.048) .062(.043) .050(.041) .057(.048)
.286(.082) .263(.083) .221(.074) .232(.086)
.146(.086) .125(.072) .089(.057) .091(.063)
.168(.057) .149(.046) .140(.049) .143(.050)
.050(.032) .040(.024) .036(.026) .038(.028)
.207(.060) .195(.054) .173(.058) .176(.057)
.074(.044) .067(.041) .052(.034) .054(.036)
.255(.078) .225(.073) .197(.102) .206(.094)
.118(.072) .093(.062) .085(.111) .082(.079)
.207(.060) .138(.044) .130(.056) .129(.046)
.077(.047) .035(.022) .033(.027) .031(.022)
.207(.060) .174(.064) .148(.059) .150(.058)
.077(.047) .054(.037) .041(.032) .042(.031)
al
.226(.071) .213(.064) .194(.068) .201(.069)
urn
0.50
BRidge BLASSO BL1/2 BRBridge
BRidge BLASSO BL1/2 BRBridge
.202(.059) .197(.067) .173(.059) .171(.067)
.076(.046) .072(.050) .057(.045) .059(.048)
.245(.073) .218(.076) .197(.061) .201(.074)
.110(.067) .092(.063) .070(.040) .079(.057)
.154(.044) .141(.045) .134(.043) .131(.045)
.041(.024) .036(.023) .032(.022) .031(.020)
.171(.049) .160(.049) .146(.054) .135(.043)
.051(.030) .045(.029) .039(.032) .033(.023)
BARidge BALASSO BAL1/2 BARBridge
.190(.055) .170(.063) .157(.061) .158(.057)
.065(.041) .054(.042) .046(.035) .043(.037)
.222(.075) .194(.067) .169(.071) .167(.072)
.096(.065) .069(.051) .058(.058) .057(.049)
.161(.052) .125(.041) .111(.046) .114(.041)
.045(.026) .028(.017) .023(.018) .022(.018)
.161(.052) .142(.046) .117(.051) .112(.049)
.045(.026) .036(.023) .027(.026) .030(.028)
BRidge BLASSO BL1/2 BRBridge
.229(.066) .211(.068) .207(.076) .189(.067)
.095(.054) .081(.056) .082(.066) .067(.049)
.283(.076) .275(.080) .235(.091) .233(.083)
.144(.076) .137(.079) .106(.085) .103(.079)
.163(.050) .155(.041) .142(.051) .141(.051)
.046(.028) .042(.021) .036(.027) .036(.025)
.199(.061) .189(.069) .169(.067) .166(.063)
.071(.040) .064(.047) .054(.045) .052(.042)
BARidge BALASSO BAL1/2 BARBridge
.206(.058) .176(.070) .171(.078) .173(.070)
.076(.042) .068(.046) .057(.050) .059(.055)
.282(.082) .226(.076) .197(.075) .206(.094)
.148(.099) .094(.062) .076(.064) .073(.066)
.195(.063) .139(.052) .122(.049) .124(.047)
.069(.044) .036(.028) .027(.020) .028(.021)
.195(.063) .160(.059) .157(.059) .158(.060)
.069(.044) .047(.034) .045(.032) .045(.033)
Jo
0.25
18
Journal Pre-proof
Model 3, n = 100
0.50
0.75
β1
β2
β3
β4
β5
β6
ξ
MMAD
MMSE
Bias
−0.048
−0.063
−0.060
−0.031
−0.019
−0.039
0.795
0.322
0.268
RMSE
0.232
0.196
0.220
0.210
0.192
0.223
0.009
0.103
0.163
95% CL
0.292
0.394
0.455
0.421
0.425
0.345
0.772
95% CU
1.214
1.117
1.172
1.165
1.094
1.120
0.808
p ro
0.25
Para
Bias
−0.073
−0.036
−0.026
−0.019
−0.073
−0.040
0.798
0.296
0.214
RMSE
0.181
0.205
0.185
0.185
0.206
0.175
0.009
0.100
0.145
95% CL
0.443
0.389
0.487
0.447
0.318
0.519
0.779
95% CU
1.088
1.283
1.194
1.153
1.075
1.110
0.811
Pr e-
τ
of
Table 6: BRBridge QR results of Models 3 and 4: εt ∼ t3 , and τ = 0.5
Bias
−0.019
−0.058
−0.057
−0.047
−0.020
−0.044
0.795
0.324
0.275
RMSE
0.225
0.218
0.212
0.209
0.205
0.221
0.008
0.104
0.160
95% CL
0.407
0.320
0.438
0.418
0.370
0.364
0.779
95% CU
1.196
1.159
1.156
1.130
1.175
1.234
0.807
Model 4, n = 200
0.50
β1
β2
β3
β25
β35
β45
β55
ξ
MMAD
MMSE
Bias
−0.080
−0.086
−0.074
0.004
0.004
0.005
−0.013
−0.010
0.485
0.380
0.302
RMSE
0.186
0.160
0.174
0.047
0.068
0.053
0.054
0.046
0.059
0.074
0.128
95% CL
0.410
0.509
0.483
−0.088
−0.136
−0.091
−0.148
−0.099
0.398
95% CU
1.092
1.006
1.040
0.088
0.120
0.124
0.084
0.062
0.610
Bias
−0.086
−0.092
−0.071
−0.005
−0.002
−0.001
0.003
0.000
0.498
0.334
0.230
RMSE
0.154
0.150
0.147
0.039
0.036
0.048
0.045
0.047
0.062
0.078
0.109
95% CL
0.517
0.529
0.519
−0.075
−0.072
−0.083
−0.091
−0.100
0.401
0.985
0.963
1.064
0.075
0.061
0.123
0.084
0.091
0.621
Bias
−0.098
−0.08
−0.101
−0.003
0.000
0.000
−0.003
0.005
0.514
0.379
0.313
RMSE
0.174
0.160
0.189
0.050
0.053
0.055
0.040
0.050
0.073
0.098
0.163
0.462
0.472
0.379
−0.126
−0.091
−0.107
−0.105
−0.081
0.419
1.024
0.999
1.033
0.073
0.105
0.127
0.070
0.120
0.675
95% CU 0.75
β15
al
0.25
Para
95% CL 95% CU
urn
τ
Jo
Note: The columns “MMAD” and “MMSE” present their averages and standard deviations based on 100 replications.
19
Journal Pre-proof
276
5. Application In this section, we applied the proposed estimation procedures to analyze the Boston housing
278
dataset. This dataset was named as “BostonHousing” and can be downloaded using R pack-
279
age “mlbench” by inputting the following commands in R Console: data(”BostonHousing”, pack-
280
age=”mlbench”), original=BostonHousing. The original data include 506 observations and 13
281
predictors variables. The outcome variable is the median value of owner-occupied homes in USD
282
1,000’s (medv). The 13 predictor variables are as follows: per capita crime rate (crim); propor-
283
tion of residential land zone for lots over 25,000 square feet (zn); proportion of non-retail business
284
acres per town (indus); Charles River dummy variable (chas); nitric oxide concentration (parts
285
per 10 million, nox); average number of rooms per dwelling (rm); proportion of owner-occupied
286
units built prior to 1940 (age); weighted distances to five Boston employment centers (dis); in-
287
dex of accessibility to radial highways (rad); full-value property-tax rate per USD 10,000 (tax),
288
pupil–teacher ratio by town (ptratio); 1, 000(B − 0.63)2 , where B is the proportion of blacks by
289
town (bk); and percentage of lower status of the population (lstat).
Pr e-
p ro
of
277
290
We fitted the Boston housing dataset using simple linear regression with a constant term. The
291
ordinary LSE indicated that the linear regression model was a proper model. We partitioned the
292
whole dataset into a training set (first 456 samples) and test set (last 50 samples). Model fitting
293
294
295
was conducted on the training set, and performance is evaluated on the test set, which can be ∑506 1 T ˆ evaluated by MAD = 50 i=457 |yi − xi β|. Quantile levels 0.25, 0.50, and 0.75 were considered.
The convergence plot (not reported) showed that the MCMC algorithm converged within 10,000 iterations. Thus, for each case we ran the Gibbs sampling algorithm 20,000 iterations, discarded
297
the first 10,000 iterations as burn-ins, and used the remaining 10,000 samples to perform posterior
298
inference. Table 7 presents the parameter estimates and their standard error estimates (se). All
299
the methods under consideration specified a sparse regression model for the housing data. The
300
estimation results are similar to those reported in the literature (Alhamzawi and Algamal, 2018).
urn
301
al
296
Table 8 presents the MADs based on BRBridge, BARBridge, BRidge, BLASSO, BL1/2 , BARidge, BALASSO, and BAL1/2 QR. As shown in Table 8, BRBridge QR performs similarly to BL1/2 QR
303
over the three quantiles and has better prediction accuracy than BRidge and BLASSO QR due to
304
smaller MAD values. Meanwhile, BARBridge QR performs slightly better than BAL1/2 , BARidge,
305
and BALASSO QR over the three quantiles. In addition, nearly all the adaptive versions of the
306
estimation methods considered perform better than their non-adaptive versions.
Jo
302
20
0.75
0.50
−0.160(0.719) 0.015(0.057)
−0.015(0.018)
−0.424(0.367)
−0.289(0.214)
−0.058(0.117) 0.034(0.040)
−0.033(0.183)
6.061(1.278) −0.023(0.043) −0.799(0.527) 0.131(0.236) −0.014(0.013) −0.502(0.341) 0.012(0.010) −0.297(0.168) −0.025(0.124) 0.045(0.047) −0.015(0.200)
6.303(1.023) −0.010(0.040) −0.858(0.520) 0.239(0.269) −0.014(0.012) −0.554(0.400)
−0.012(0.020)
−0.154(0.292)
0.009(0.013)
−0.173(0.199)
−0.044(0.124)
0.030(0.048)
−0.022(0.149)
0.396(0.941)
0.193(0.398)
4.899(2.149)
−0.019(0.047)
−0.538(0.481)
0.089(0.214)
−0.013(0.017)
−0.334(0.324)
0.011(0.012)
−0.246(0.185)
−0.017(0.117)
0.038(0.049)
−0.013(0.140)
0.577(1.149)
0.189(0.472)
5.742(1.576)
−0.010(0.044)
−0.653(0.484)
0.171(0.237)
−0.012(0.016)
−0.413(0.347)
tax
ptratio
bk
latat
crim
zn
indus
chas
nox
rm
age
dis
rad
tax
ptratio
bk
latat
crim
zn
indus
chas
nox
rm
age
dis
rad
tax
ptratio
21
al
0.015(0.011) −0.338(0.165)
0.014(0.013)
−0.297(0.175)
bk
0.136(1.768)
2.091(2.307)
0.204(1.552)
1.238(1.975)
0.011(0.013)
latat
urn
0.032(0.184)
0.081(0.260)
−0.763(0.734)
−0.259(0.431)
dis
−0.256(0.520)
−0.006(0.074)
0.113(0.649)
−0.276(0.850)
0.002(0.214)
0.893(1.279)
0.243(0.626)
0.120(0.990)
0.015(0.560)
0.058(0.211)
−0.010(0.466)
−0.262(0.384)
0.012(0.031)
−0.150(0.546)
−0.013(0.043)
0.090(0.452)
−0.210(0.650)
−0.013(0.128)
0.651(1.027)
−0.344(0.159)
0.015(0.011)
−0.563(0.332)
−0.014(0.013)
0.247(0.265)
−0.878(0.483)
−0.013(0.043)
6.307(0.979)
0.149(1.275)
1.893(2.032)
−0.015(0.181)
0.047(0.044)
−0.019(0.149)
−0.303(0.160)
0.012(0.009)
−0.509(0.307)
−0.014(0.013)
0.145(0.232)
−0.810(0.455)
−0.024(0.041)
6.004(1.132)
0.238(0.981)
1.202(1.616)
−0.034(0.164)
0.036(0.040)
−0.067(0.119)
−0.294(0.196)
0.010(0.013)
−0.441(0.343)
−0.015(0.016)
0.082(0.250)
−0.750(0.567)
−0.032(0.050)
5.442(1.431)
0.250(1.480)
0.987(1.641)
−0.048(0.209)
0.029(0.051)
−0.106(0.143)
Est (se.)
BARidge Est (se)
BLASSO
6.103(1.048)
0.209(1.222)
1.287(1.797)
−0.031(0.150)
0.034(0.038)
−0.059(0.109)
−0.287(0.183)
0.011(0.011)
−0.429(0.341)
−0.015(0.015)
0.082(0.231)
−0.746(0.558)
−0.030(0.045)
5.530(1.340)
0.299(1.874)
0.957(1.961)
−0.038(0.172)
0.028(0.046)
−0.092(0.128)
Est (se)
BALASSO
−0.281(0.307)
0.014(0.028)
−0.352(0.523)
−0.011(0.036)
0.167(0.417)
−0.579(0.690)
−0.012(0.104)
4.174(2.466)
0.214(0.563)
0.566(1.179)
−0.015(0.311)
0.047(0.112)
−0.007(0.263)
−0.248(0.263)
0.012(0.020)
−0.247(0.419)
−0.013(0.027)
0.083(0.309)
−0.408(0.574)
−0.019(0.079)
BL1/2
−0.337(0.213)
0.104(0.013)
−0.547(0.467)
−0.014(0.018)
0.233(0.306)
−0.869(0.677)
−0.013(0.060)
6.301(1.103)
0.105(3.260)
2.111(2.698)
−0.017(0.199)
−0.292(0.170)
0.014(0.012)
−0.433(0.351)
−0.013(0.015)
0.176(0.229)
−0.667(0.482)
−0.011(0.045)
5.799(1.485)
0.181(0.493)
0.561(1.125)
−0.010(0.145)
0.041(0.045)
−0.017(0.122)
−0.249(0.167)
0.012(0.011)
0.347(0.312)
−0.013(0.015)
0.091(0.192)
−0.550(0.457)
−0.020(0.041)
4.906(2.136)
0.197(0.383)
0.383(0.827)
−0.022(0.129)
0.030(0.042)
−0.046(0.102)
−0.189(0.191)
0.009(0.012)
−0.182(0.285)
−0.012(0.018)
0.036(0.180)
−0.320(0.441)
−0.018(0.048)
2.884(2.614)
0.130(0.373)
0.205(0.685)
−0.024(0.147)
0.024(0.048)
−0.058(0.104)
Est (se)
of 0.045(0.056)
−0.020(0.165)
−0.300(0.161)
0.012(0.010)
−0.512(0.294)
−0.014(0.012)
0.132(0.211)
−0.797(0.472)
−0.023(0.038)
p ro
2.865(2.539)
0.181(0.453)
0.326(0.907)
−0.023(0.246)
0.038(0.083)
−0.059(0.192)
−0.174(0.241)
0.011(0.019)
−0.074(0.319)
−0.013(0.028)
0.030(0.250)
−0.114(0.392)
−0.015(0.080)
0.644(1.417)
0.079(0.301)
0.071(0.517)
−0.022(0.223)
0.032(0.082)
−0.069(0.172)
Pr e-
0.168(0.522)
0.090(0.759)
−0.022(0.397)
0.046(0.132)
−0.062(0.305)
−0.236(0.299)
0.011(0.024)
−0.079(0.393)
−0.015(0.034)
0.059(0.344)
−0.111(0.460)
−0.015(0.105)
rad
0.308(0.597)
5.512(1.545)
−0.030(0.052)
2.366(2.625)
−0.016(0.052)
rm
0.085(0.376)
0.252(1.743)
0.049(0.526)
−0.034(0.306)
age
Jo
0.103(0.332)
nox
0.964(1.840)
0.177(0.643)
chas
0.036(0.112)
0.027(0.049) −0.042(0.195)
0.024(0.058)
−0.022(0.147)
zn
indus
−0.093(0.228)
−0.091(0.135)
−0.053(0.117)
crim
0.25
BRidge Est (se)
Est (se)
Est (se)
BARBridge
BRBridge
Method
Parameter
τ
Table 7: Estimation results of regression coefficients for Boston housing dataset
−0.336(0.202)
0.014(0.013)
−0.538(0.420)
−0.013(0.017)
0.219(0.298)
−0.853(0.678)
−0.010(0.050)
6.305(1.265)
0.199(4.432)
2.076(2.990)
−0.008(0.254)
0.042(0.061)
−0.015(0.229)
−0.299(0.179)
0.012(0.010)
−0.490(0.333)
−0.014(0.012)
0.128(0.231)
−0.812(0.548)
−0.020(0.038)
6.026(1.151)
0.246(2.424)
1.329(2.368)
−0.027(0.153)
0.032(0.041)
−0.053(0.121)
−0.279(0.195)
0.010(0.010)
−0.426(0.379)
−0.014(0.014)
0.068(0.228)
−0.720(0.611)
−0.028(0.041)
5.550(1.373)
0.238(1.999)
0.936(2.118)
−0.036(0.171)
0.024(0.046)
−0.082(0.117)
Est (se)
BAL1/2
Journal Pre-proof
Journal Pre-proof
Table 8: MAD values of Boston housing dataset of different methods
τ = 0.50
τ = 0.75
BRidge
28.33
25.37
16.10
BLASSO
24.02
14.29
4.855
BL1/2
14.97
4.724
4.029
BRBridge
15.24
4.282
4.106
BARidge
7.084
3.359
4.898
BALASSO
5.838
3.189
4.324
BAL1/2
5.134
2.917
4.137
BARBridge
5.115
3.026
4.239
p ro
6. Conclusion
of
τ = 0.25
Pr e-
307
Method
In this study, we consider BRBridge and BARBridge QR. Simulations and a real-data example
309
are conducted to illustrate the proposed procedures. The two Bayesian penalized procedures
310
perform well in general and can specify and estimate significant regression variables for dense
311
and sparse cases. In the simulations, penalty index ξ has a left-skewed posterior distribution
312
on the interval (0, 1) for dense and sparse coefficients. However, the average estimation value of
313
ξ is mostly close to 1 and 0.5 for dense and sparse coefficients, respectively. For future work,
314
this phenomenon must be verified from a theoretical perspective. As shown in the simulations,
315
the proposed methods cannot manage the case of small n large p (p > n), thereby restricting
316
the application of our methods to the analysis of ultrahigh dimensional data. How to address
317
this problem is certainly of future research interest. Moreover, in this study we set the prior
318
distribution of penalty exponent ξ as a standard uniform distribution. Other prior distributions
319
of ξ, such as truncated normal or truncated Gamma distribution, for improving the estimation
320
efficiency of BRBridge and BARBridge QR are worthy of further investigation. Furthermore, we
321
illustrate the proposed BRBridge and BARBridge QR in the context of a linear regression model.
322
The proposed methods can apparently be applied to nonlinear models and many other statistical
323
models. Finally, several other methods assess the performance of the quantile estimation without
324
fixing τ . For example, the composite quantile regression (CQR) proposed by Zou and Yuan (2008)
325
assesses the estimation accuracy of the quantile function by averaging a series of quantile regression
326
estimators at different quantiles. Owing to its advantages in integrating various quantile levels to
327
improve estimation, CQR and its variants have received considerable attention in recent years (e.g.,
Jo
urn
al
308
22
Journal Pre-proof
Kai et al., 2010, 2011; Jiang et al., 2012, 2014; Wang et al., 2013; Zhao et al., 2017). Another
329
direction is to estimate the quanitle function Q(τ ) = F −1 (τ ) = inf {x : F (x) ≥ τ }, where F (x) is
330
the cumulative distribution function of random variable X, and Q(τ ) is a continuous function of τ
331
in interval (0, 1) (e.g., Cheng, 1995; Cai, 2010; Sankaran and Midhu, 2017). The quantile regression
332
without fixing τ may also be considered jointly with the proposed methods. The aforementioned
333
extensions will considerably enlarge the application scope of the proposed methods.
334
Acknowledgements
of
328
The work is supported by China Postdoctoral Science Foundation (No.2017M610156), National
336
Social Science Foundation of China (No.15ZDA009), Research Grant Council of the Hong Kong
337
Special Administration Region (No. 14303017, 14301918), and direct grants from the Chinese
338
University of Hong Kong.
339
References
342 343
344 345
346
Pr e-
341
Alhamzawi, R., Yu, K., Benoit, D. F. (2012). Bayesian adaptive Lasso quantile regression. Statistical Modelling, 12, 279-297.
Alhamzawi, R., Ali, H. T. M. (2018). Bayesian Tobit quantile regression with L1/2 penalty. Communications in Statistics – Simulation and Computation, 47(6), 1739-1750. Alhamzawi, R., Algamal, Z. Y. (2018). Bayesian bridge quantile regression. Communications in Statistics – Simulation and Computation, to appear.
Betancourt, B., Rodriguez, A., Boyd, N. (2017). Bayesian fused Lasso regression for dynamic binary
al
340
p ro
335
networks. Journal of Computational and Graphical Statistics, 26(4), 840-850.
348
Cai, Y. Z. (2010). Multivariate quantile function models. Statistica Sinica, 20, 481-496.
349
Cheng, C. (1995). The Bernstein polynomial estimator of a smooth quantile function. Statistics &
351 352
353 354
Probability Letters, 24(4), 321-330.
Deng, J., Pandey, M. D. (2008). Estimation of the maximum entropy quantile function using fractional probability weighted moments. Structural Safety, 30(4), 307-319. Davino, C., Furno, M., Vistocco, D. (2014). Quantile Regression: Theory and Applications. New York: John Wiley & Sons.
Jo
350
urn
347
355
Devroye,L. (1986). Non-Uniform Random Variate Generation. New York: Springer- Verlag.
356
Fan, J., Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties.
357
Journal of the American Statistical Association, 96(456), 1348-1360.
23
Journal Pre-proof
359
360 361
362 363
364 365
Fu, W. (1998). Penalized regressions: the bridge versus the Lasso. Journal of Computational and Graphical Statistics, 7(3), 397-416. Gefan, D. (2014). Bayesian doubly adaptive elastic-net Lasso for VAR shrinkage. International Journal of Forecasting, 30(1), 1-11. Geraci, M., Bottai, M. (2007). Quantile regression for longitudinal data using the asymmetric Laplace distribution. Biostatistics, 8(1), 140-154. Hunter, D. R., Lange, K. (2000). Quantile regression via an MM algorithm. Journal of Computational
of
358
and Graphical Statistics, 9(1), 60-77.
Huang, Y. (2016). Quantile regression-based Bayesian semiparametric mixed-effects models for longitu-
367
dinal data with non-normal, missing and mismeasured covariate. Journal of Statistical Computation
368
and Simulation, 86(6), 1183-1202.
370
371 372
373 374
Huang, J., Horowitz, J. L., Ma, S. (2008). Asymptotic properties of bridge estimators in sparse highdimensional regression models. Annals of Statistics, 36(2), 587-613.
Jiang, X. J., Jiang, J. C., Song, X. Y. (2012). Oracle model selection for nonlinear models based on weighted composite quantile regression. Statistica Sinica, 22, 1479-1506.
Pr e-
369
p ro
366
Jiang, X. J., Jiang, J, C., Song, X. Y. (2014). Weighted composite quantile regression estimation of DTARCH models. Econometrics Journal, 17, 1-23.
375
Kai, B., Li, R. Z., Zou, H. (2010). Local composite quantile regression smoothing: an efficient and safe
376
alternative to local polynomial regression. Journal of the Royal Statistical Society: Series B, 72,
377
49-69.
379
Kai, B., Li, R. Z., Zou, H. (2011). New efficient estimation and variable selection methods for semiparametric varying-coefficient partially linear models. Annals of Statistics, 39, 305-332.
al
378
Knight, K., Fu, W. (2000). Asymptotics for lasso-type estimators. Annals of Statistics, 28(5), 1356-1378.
381
Koenker, R. (2005). Quantile regression. Cambridge: Cambridge University Press.
382
Koenker, R., Bassett, J. (1978). Regression quantiles. Econometrica, 46(1), 33-50.
383
Koenker, R., Park, B. J (1996). An interior point algorithm for nonlinear quantile regression. Journal of
385 386
387 388
389 390
391
Econometrics, 71(12), 265-283.
Koenker, R., Chernozhukov, V., He, X., Peng, L. (2017). Handbook of Quantile Regression. Portland: Chapman and Hall.
Kozumi, H., Kobayashi, G (2011). Gibbs sampling methods for Bayesian quantile regression. Journal of
Jo
384
urn
380
Statistical Computation and Simulation, 81, 1565-1578. Kobayashi, G., Kozumi, H. (2013). Bayesian analysis of quantile regression for censored dynamic panel data. Computational Statistics, 27, 359-380. Mallick, H., Yi, N. (2018). Bayesian bridge regression. Journal of Applied Statistics, 45(6), 988-1008.
24
Journal Pre-proof
398 399
400 401
402 403
404 405
406 407
408 409
410 411
412 413
414 415
416 417
418 419
420 421
Society: Series B, 76(4), 713-733. Reich, B. J., Fuentes, M., Dunson, D. B. (2011). Bayesian spatial quantile regression. Journal of the American Statistical Association, 106, 6-20.
of
397
Polson, N. G., Scott, J. G., Windle, J. (2014). The Bayesian bridge. Journal of the Royal Statistical
Sankaran, P. G., Midhu, N. N (2017). Nonparametric estimation of mean residual quantile function under right censoring. Journal of Applied Statistics, 44(10), 1856-1874.
p ro
396
103(482), 681-686.
Tibshirani, R. (1996). Regression shrinkage and selection via the Lasso. Journal of the Royal Statistical Society, Series B, 24(3), 267-288.
Tian, Y., Tian, M., Zhu, Q. (2014). Linear quantile regression based on EM algorithm. Communications in Statistics – Theory and Methods, 43(16), 3464-3484.
Tian, Y., Li, E., Tian, M. (2016). Bayesian joint quantile regression for mixed effects models with
Pr e-
395
Park, T. and Casella, G. (2008). The Bayesian Lasso. Journal of the American Statistical Association,
censoring and errors in covariates. Computational Statistics, 31(3), 1-27. Xu, Z., Zhang, H., Wang, Y., Chang, X., Liang, Y. (2010). L1/2 regularization. Science China Information Sciences, 53(6), 1159-1169.
Yu, K., Moyeed, R. A. (2001). Bayesian quantile regression, Statistics and Probability Letters, 54, 437447.
Zhao, K. F., Lian, H. (2015). Bayesian Tobit quantile regression with single-index models. Journal of Statistical Computation and Simulation, 85(6), 1247-1263.
al
394
Applied Statistics, 33(9), 1031-1032.
Zhao, W. H., Lian, H., Song, X. Y. (2017). Composite quantile regression for correlated data. Computational Statistics and Data Analysis, 109, 15-33. Zou, H. (2006). The adaptive LASSO and its oracle properties. Journal of the American Statistical
urn
393
Nadarajah, S. (2006). Acknowledgement of priority: the generalized normal distribution. Journal of
Association, 101, 1418-1429.
Zou, H., Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society, Series B, 2010, 67(2), 301-320. Zou, H., Yuan, M. (2008). Composite quantile regression and the oracle model selection theory. The Annals of Statistics, 36, 1108-1126.
Jo
392
25