Predicting failure risk using financial ratios: Quantile hazard model approach


North American Journal of Economics and Finance xxx (xxxx) xxx–xxx

Contents lists available at ScienceDirect

North American Journal of Economics and Finance journal homepage: www.elsevier.com/locate/najef

Manh Cuong Dong (a), Shaonan Tian (b), Cathy W.S. Chen (c)

(a) Department of Economics, Feng Chia University, Taichung, Taiwan
(b) Department of Marketing & Decision Sciences, San Jose State University, San Jose, USA
(c) Department of Statistics, Feng Chia University, Taichung, Taiwan

ARTICLE INFO

JEL classifications: G17, G33, C21

Keywords: Default risk; Discrete hazard model; Quantile hazard model; LASSO; Industrial dummy variables

ABSTRACT

This study examines the role of financial ratios in predicting companies’ default risk using the quantile hazard model (QHM) approach and compares its results to the discrete hazard model (DHM). We adopt the LASSO method to select essential predictors among the variables mentioned in the literature. We demonstrate the advantage of the proposed QHM by showing that the effects of financial ratios differ in degree across quantile levels. While DHM only confirms the adverse effects of “stock return volatilities” and “total liabilities” and the positive effects of “stock price”, “stock excess return”, and “profitability” on businesses, under high quantile levels QHM is able to add “cash and short-term investment to total assets”, “market capitalization”, and “current liabilities ratio” to the list of factors that influence a default. More interestingly, “cash and short-term investment to total assets” and “market capitalization” switch signs in high quantile levels, showing their different influence on companies with different risk levels. We also discover evidence for the distinction of default probability among different industrial sectors. Lastly, our proposed QHM empirically demonstrates improved out-of-sample forecasting performance.

1. Introduction

The literature has long studied the modeling and forecasting of default risk at all levels. The occurrences of bankruptcy filing events bring strong negative effects to an economy and society, not to mention massive financial losses. Developing a bankruptcy prediction model that can help gain insights into understanding the relationship between a company’s financial performance and its default risk has thus become critical to creditors, shareholders, and regulators, especially under the current framework of Basel II, which sets a higher standard on the calculation of cash reserves.

Given the broad interests in various areas, the literature has extensively studied financial distress prediction models. For example, Schwaab, Koopman, and Lucas (2016) conduct a study using data on 41 countries to find the dynamic properties of systematic default risk; Switzer, Wang, and Tu (2016) examine the relationship between corporate governance and default risk in financial firms in 28 countries outside of North America over the post-financial crisis period; and Memmel, Gunduz, and Raupach (2015) attempt to find the common drivers of default risk across different industries in Germany and many other countries.

In spite of its importance, most research studies prior to the 2000s adopt a static single-period classification model to estimate default risk using cross-sectional financial ratios – for example, Beaver (1966, 1968), Altman (1968), Ohlson (1980), Zmijewski



Corresponding author. E-mail address: [email protected] (C.W.S. Chen).

https://doi.org/10.1016/j.najef.2018.01.005 Received 28 August 2017; Received in revised form 1 January 2018; Accepted 8 January 2018 1062-9408/ © 2018 Elsevier Inc. All rights reserved.

Please cite this article as: Dong, M.C., North American Journal of Economics and Finance (2018), https://doi.org/10.1016/j.najef.2018.01.005


(1984), etc. Despite the time-varying characteristics of firms that are observed in those models, the single-period logistic model utilizes only one outcome observation per company to estimate default risk. To overcome this shortcoming, Shumway (2001) develops a discrete hazard model (DHM hereafter) to predict corporate bankruptcy using historical time-varying data. DHM demonstrates success in providing an improved estimate of bankruptcy probability, while the static model ignores the fact that a firm’s health status changes over time and thus may produce a biased and inconsistent probability estimate. DHM has hence quickly gained popularity, with many recent bankruptcy prediction studies adopting it – for example, Chava and Jarrow (2004), Campbell, Hilscher, and Szilagyi (2008), and Ding, Tian, Yu, and Guo (2012).

While a vast amount of literature focuses on developing a bankruptcy prediction model using time-varying firm-specific predictor variables, it only provides an “overview” or a “snapshot” that summarizes the relationship between the predictor variables and default risk at the conditional mean. Just as the mean provides incomplete information about any distribution, a full description of a regression curve at all points would be helpful in revealing more useful information. In a similar vein, the current bankruptcy literature encompasses the relationship between default risk and its firm-specific predictor variables using a single regression model, but such a parametric relationship in a firm’s behaviors may be heterogeneous across quantiles in bankruptcy prediction. In this work we introduce a quantile hazard regression model to discover how a company’s behavior at different quantiles affects default risk and present a grand summary of the relationship between the predictor variables and a firm’s default status over time.
More specifically, we divide companies into quantiles based on their probability of default and find that the relationships between the default status and the predictor variables using accounting and market data are quite different at the upper tail quantile levels versus the estimates at lower tail or middle quantile levels. The statistics literature has extensively studied the quantile regression model, demonstrating success at discovering extreme behaviors by examining distributions, especially at tail quantiles, and by presenting a more thorough view of the distribution (Koenker, 2005). This approach is well suited to studying companies’ default risk because of its asymmetric distribution (the number of failed companies is usually much smaller than the number of surviving ones). In a similar vein, Siao, Hwang, and Chu (2016) use logistic quantile regression to study companies’ recovery rates (after bankruptcy filing). While applying the quantile approach, their work repeats the shortcomings of static models in ignoring the time-varying characteristics of firms and uses only one outcome observation value per company.

Motivated by the developments in Koenker and Bassett (1978), Bassett and Koenker (1982), Yu, Lu, and Stander (2003), and Chen, Gerlach, and Wei (2009), in this work we propose to incorporate quantile analysis into a hazard model and use the quantile hazard model (QHM hereafter) to investigate how default risk correlates with the determinants at different quantile levels using time-varying accounting and market data. We aim to fill the current gap in the literature by thoroughly exploring the relationship between a company’s financial performance and its default risk across different quantile levels in order to discover some potential “tail properties”. This work applies the proposed QHM to the bankruptcy database using a set of predictor variables selected by the least absolute shrinkage and selection operator (LASSO) technique.
The LASSO method is a popular statistical variable selection approach that provides a parsimonious predictor variable-set solution through penalizing the regression coefficient via a shrinkage method (for details, see Tibshirani, 1996; Efron, Hastie, Johnstone, & Tibshirani, 2004; Meier, Van de Geer, & Bühlmann, 2008; Pereira, Basto, & Ferreira da Silva, 2016). Tian, Yu, and Guo (2015) show that the LASSO method demonstrates strong success in improving the bankruptcy prediction model’s forecasting accuracy using a comprehensive U.S. database. For this work, we use the LASSO method to choose the predictor variables so as to study the prediction performance using our proposed QHM.¹

Our empirical study reveals that the loadings of the predictor variables at high quantile levels are quite different versus the estimates from fitting the discrete hazard model. In particular, for five predictor variables – stock volatility (SIGMA), stock excess return (EXRETURN), the profitability ratio of net income over the market value of total assets (NIMTA), the liability ratio of total liability over the market value of total assets (LTMTA), and the log of stock price (PRICE) – the magnitude of the coefficient estimates in predicting a default increases with the expected signs over high quantile levels. Such findings show that these variables become increasingly important measures in predicting bankruptcy as the quantile level rises. Conversely, the liquidity ratio of cash and short-term investment to the market value of total assets (CASHMTA), the current liabilities to total asset (LCTAT) ratio, and the market capitalization variable (RSIZE) that are insignificant in DHM are found to be important indicators in high quantile levels of QHM. More interestingly, CASHMTA and RSIZE switch signs over the high quantile levels, indicating their different effects toward low and high risk companies.
Our empirical results also indicate that the coefficient estimates at middle quantile levels are similar to the estimates from the discrete hazard model fitting, but the estimates from low quantile levels are instead not statistically significant in predicting corporate default. We use QHM to evaluate the in-sample fitting over the training data from 2000 to 2009 and its out-of-sample performance on the testing data from 2010 to 2014. The forecasting results illustrate that QHM improves the out-of-sample performance compared to the popular discrete hazard model.

In addition, we investigate the differences in default risk that may exist between various industrial sector firms by introducing three industrial dummy variables used in Chava and Jarrow (2004). Our empirical study shows that the significance and the magnitude of the industrial dummy variables are different across quantile levels of QHM and also different from DHM’s results. Such a finding is rather interesting, especially for risk management purposes, as more and more global investors are striving to formulate diverse portfolios.

The rest of the paper runs as follows. Section 2 presents DHM, the LASSO method, and our proposed QHM model in greater detail. Section 3 provides a description about our bankruptcy database, the detailed LASSO procedure, and the construction of industrial dummy variables. Section 4 summarizes the empirical results of the predictive ability of the financial ratios on corporate default

¹ We also try other sets of predictor variables, like the variable set in Campbell et al. (2008). The results are qualitatively similar.


based on QHM and compares them to DHM. Section 5 offers some concluding remarks.

2. The statistical models

The following section gives general ideas of the discrete hazard model in Shumway (2001), the LASSO method, and the proposed quantile hazard model.

2.1. Discrete hazard model

The default risk estimation literature theoretically prefers DHM to a single-period logistic regression, because it adopts all available information to calculate the probability of bankruptcy for firms at different times. By using all available data, it avoids the selection biases inherent in static models. Thus, DHM has quickly gained popularity in the current literature. In our study we adopt DHM as a benchmark model for comparison purposes.

DHM links the binary response variable and the predictor variables through a logistic model. More specifically, yi,t is a binary indicator of company i’s bankruptcy status at time t. We define yi,t = 1 if company i has filed for bankruptcy at time t and yi,t = 0 otherwise. Let x (discrete or continuous) be the explanatory variables. Specifically, xi,t is a time-varying vector of variables observed at time t for company i. We write the discrete hazard model for one-step-ahead default risk prediction as follows:

$$P(y_{i,t+1} = 1 \mid y_{i,t} = 0, x_{i,t}) = \frac{\exp(\beta_0 + \beta' x_{i,t})}{1 + \exp(\beta_0 + \beta' x_{i,t})}, \tag{1}$$

where β0 is the intercept, and β is the vector of parameters corresponding to the explanatory variables x.

2.2. Variable selection using LASSO

Variable selection is indispensable for identifying the most suitable predictor variables and for enhancing forecasting performance. The LASSO method, first introduced by Tibshirani (1996), is one of the most popular variable selection methods used in the existing bankruptcy literature, since it enjoys several attractive statistical features. First, the LASSO method is able to provide a stable estimation even when the events of interest (“bankruptcy” in our work) are rather rare. Second, the shrinkage method adopted in the LASSO method may bring improved forecasting performance. Third, the LASSO method offers an entire variable selection path. Therefore, we are able to identify the relative importance of each variable. Such an approach further enhances the model’s interpretability. Fourth, the LASSO method is able to handle the multicollinearity problem, which is very important in our study since the financial ratios sometimes have strong correlations among themselves. Lastly, the LASSO method is computationally efficient, especially when there are a large number of candidate predictors.

In this work we apply the LASSO technique to select the best predictor variable set and obtain the LASSO-selected parameters of DHM in Eq. (1) by minimizing the function:

$$\sum_{i=1}^{n} \left( -y_{i,t+1}(\beta_0 + \beta' x_{i,t}) + \log\left(1 + \exp(\beta_0 + \beta' x_{i,t})\right) \right) \quad \text{subject to} \quad \sum_{k=1}^{p} |\beta_k| \le s,$$

or, equivalently,

$$\sum_{i=1}^{n} \left( -y_{i,t+1}(\beta_0 + \beta' x_{i,t}) + \log\left(1 + \exp(\beta_0 + \beta' x_{i,t})\right) \right) + \lambda \sum_{k=1}^{p} |\beta_k|,$$

where n is the number of firms in the dataset, and p is the number of predictive variables used in DHM. The tuning parameter λ or s controls the amount of penalization. Note that a smaller value of s leads to a smaller number of selected variables.

2.3. Quantile hazard model

For the procedure of dividing a binary outcome into quantiles, one may refer to Koenker (2005). To apply quantile analysis to a logistic regression, Bottai, Cai, and McKeown (2010) propose an advanced model, called logistic quantile regression, which deals not only with binary outcomes, but also with outcomes in a bounded range over different quantile levels. We apply this idea to the discrete hazard model and develop QHM as follows.

We again let y be the output variable that is observable in our bankruptcy database, with each element yi,t as a binary outcome of company i at time t; and x is a set of input variables, with xi,t as a time-varying vector of variables observed at time t for company i. We define the conditional quantile of y at probability level τ ∈ (0,1) as qτ(y). For example, q0.5(y) is the value of y that divides the conditional distribution of outcomes into halves. To analyze default risk with dichotomous data, we now need a transformation to model default probability. The proposed function model follows Bottai et al. (2010) and has the form below:


$$h(y_{i,t}) = \log\left(\frac{y_{i,t} + \varepsilon}{1 - y_{i,t} + \varepsilon}\right) = \beta_0 + \beta' x_{i,t}, \tag{2}$$

where h(·) is the transformation function, and ε is a small quantity added to both numerator and denominator to make sure that h(·) is well defined when the observation equals the lower or the upper limit. Applying the quantile function, we observe:

$$h(q_\tau(y)) = \log\left(\frac{q_\tau(y) + \varepsilon}{1 - q_\tau(y) + \varepsilon}\right) = \beta_{0\tau} + \beta_\tau' x,$$

where β0τ is a constant, and βτ is the coefficient vector at any specified quantile level τ. Solving for qτ(y) gives the result:

$$q_\tau(y) = \frac{(1 + \varepsilon)\exp(\beta_{0\tau} + \beta_\tau' x) - \varepsilon}{1 + \exp(\beta_{0\tau} + \beta_\tau' x)}. \tag{3}$$
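The transformation in Eq. (2) and its inverse in Eq. (3) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `h` and `h_inverse` are hypothetical, and ε = 0.001 is used only because it is the value the paper adopts in Section 4.

```python
import math


def h(y, eps):
    """Transformation of Eq. (2): log((y + eps) / (1 - y + eps))."""
    return math.log((y + eps) / (1.0 - y + eps))


def h_inverse(eta, eps):
    """Inverse of Eq. (3): maps a linear predictor eta back to the [0, 1] scale."""
    return ((1.0 + eps) * math.exp(eta) - eps) / (1.0 + math.exp(eta))


EPS = 0.001  # illustrative choice for this sketch

# Thanks to eps, the transform stays finite at both bounds of a binary outcome.
lower, upper = h(0.0, EPS), h(1.0, EPS)

# Applying Eq. (3) to the output of Eq. (2) recovers the original value.
roundtrip = [h_inverse(h(y, EPS), EPS) for y in (0.0, 0.3, 1.0)]
```

The two bounds are symmetric around zero, and a linear predictor of zero maps back to 0.5, which is a quick sanity check that the two formulas are mutual inverses.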

Note that the transformation function used in this method is different from DHM and the logistic regression, which transform the probabilities. Here, we simply estimate the regression coefficients using the quantile regression of transformed outcome h(yi,t) (Orsini & Bottai, 2011). According to Koenker and Bassett (1978) and Koenker (2005), we can estimate this by minimizing:

$$\min_{\beta_{0\tau},\,\beta_\tau} \sum_{i=1}^{n} \sum_{t=1}^{k} \rho_\tau\left[h(y_{i,t}) - h(q_\tau(y))\right],$$

where ρτ(·) is the loss function defined by ρτ(u) = u × (τ − I(u < 0)), and I is an indicator function. Moreover, k and n are the number of observed years and the number of observed firms in each year, respectively. The value of n in each year is different due to the entrance or exit of firms.

To choose an appropriate ε value in transformation function (2) for any particular dataset, one could follow the procedure in Siao et al. (2016). More specifically, we calculate the estimated $\hat{q}_\tau(y_{i,t})$ in Eq. (3) using various values of ε and then calculate the average of the absolute difference between those estimated $\hat{q}_\tau(y_{i,t})$ using every two consecutive values of chosen ε in different quantile levels. Next, we choose the ε value for which this average absolute difference is smallest, because it indicates that the estimated values of $\hat{q}_\tau(y_{i,t})$ do not change too much when ε values move around the chosen one. Our specific procedure for choosing a suitable ε value is in Section 4.

Following QHM above, we provide the one-step-ahead prediction for default risk using each set of coefficients over different quantile levels as follows:

$$P(y_{i,t+1} = 1 \mid y_{i,t} = 0, x_{i,t}) = \frac{(1 + \varepsilon)\exp(\beta_{0\tau} + \beta_\tau' x_{i,t}) - \varepsilon}{1 + \exp(\beta_{0\tau} + \beta_\tau' x_{i,t})}. \tag{4}$$
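To make the loss function ρτ used in the minimization above concrete, the following Python sketch evaluates it on a small made-up sample. It assumes the simplest intercept-only case, in which the minimizer of the summed loss is a sample τ-quantile; the data and the helper names `check_loss` and `total_loss` are hypothetical and are not taken from the paper.

```python
def check_loss(u, tau):
    """rho_tau(u) = u * (tau - I(u < 0)), the quantile regression loss."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def total_loss(c, values, tau):
    """Summed check loss when every observation is predicted by the constant c."""
    return sum(check_loss(v - c, tau) for v in values)


# Hypothetical sample, used only for illustration.
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
tau = 0.75

# Search over the observed values for the constant with the smallest loss.
best = min(values, key=lambda c: total_loss(c, values, tau))
```

With τ = 0.75, positive residuals are penalized three times as heavily as negative ones, so the minimizing constant sits in the upper part of the sample rather than at its center. QHM applies the same asymmetric weighting to the transformed outcome h(yi,t).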

Eq. (4) presents various prediction results for a company’s default risk under different sets of coefficients achieved over different quantile levels.

3. Data statistics and variables’ selection result

This section provides descriptions about our dataset, the variable selection process using the LASSO technique, and the industrial dummy variables.

3.1. Data statistics

We construct our bankruptcy database used herein by merging daily and monthly market trading data from CRSP and accounting information from COMPUSTAT. Our sampling period covers observations from January 2000 to December 2014. Our dataset contains 9501 firms with 67,026 firm-month observations. To construct the binary indicator response variable y, we define a bankruptcy event if a company has filed for bankruptcy under either the Chapter 7 (Liquidation) or Chapter 11 (Reorganization) bankruptcy protection code. For firms that have multiple bankruptcy filings, we only consider the first filing event.

For the choice of candidate predictor variables, we construct a total of 15 predictor variables. These combine the variables of the popular z-score model of Altman (1968), the model of Campbell et al. (2008), and the model of Tian et al. (2015). In particular, we construct stock return volatility (SIGMA), stock excess return (EXRETURN), profitability ratios of net income to the market value of total assets (NIMTA) and retained earnings to the book value of total asset (REAT), liability ratios of total liability to the market value of total assets (LTMTA), current liability to the market value of total assets (LCTMTA) and current liability to the book value of total assets (LCTAT), liquidity ratios of cash and short-term investment to the market value of total assets (CASHMTA) and working capital to the book value of total assets (WCAPAT), the capital-turnover ratio of sales to the book value of total assets (SALEAT), the productivity ratio of earnings before


interest and tax to the book value of total asset (EBITAT), market equity to the book value of total debt (MVEF), market capitalization (RSIZE), the market-to-book ratio (MB), and the log of stock trading price truncated at $15 (PRICE). Consistent with Campbell et al. (2008), we use the standard deviation of each firm’s daily stock return over the past three months to calculate the stock return volatility (SIGMA). Hence, for all the market-based variables including SIGMA, EXRETURN, and PRICE, we update the predictor variables monthly. To carefully align the annually updated accounting-based variables, we first adjust the company’s fiscal year to the calendar year and lag the accounting records by four months. Such an adjustment is common in the bankruptcy literature for the purpose of ensuring all accounting information is available to the market over the time when the default risk is estimated. We next temporarily align accounting-based predictor variables with market-based predictor variables at a monthly-updated frequency. Table 1 summarizes the variables’ description.

Table 2 illustrates the descriptive statistics of all the variables used in this study. For the DEFAULT variable, we find that most of its observations cluster at the value zero, because the mean value is very low. This indicates that operating companies make up the large majority compared to defaulted ones. This phenomenon lends support to using the quantile approach instead of analysis on the mean.

Table 1 Variables’ description.

| Variable | Description |
|---|---|
| DEFAULT* | 1 if the company goes into bankruptcy; otherwise 0 |
| SIGMA | Stock return volatilities |
| EXRETURN | Stock return in excess of market return (S&P 500 index) |
| NIMTA | Net income/market value of total assets |
| LTMTA | Total liabilities/market value of total assets |
| CASHMTA | Cash and short-term investment/market value of total assets |
| RSIZE | Log(market capitalization) |
| MB | Market-to-book ratio |
| PRICE | Log(stock price) |
| EBITAT | Earnings before interest and taxes/book value of total assets |
| REAT | Retained earnings/book value of total assets |
| WCAPAT | Working capital/book value of total assets |
| SALEAT | Sales/book value of total assets |
| MVEF | Market equity (yearly)/book value of total debt |
| LCTMTA | Current liabilities/market value of total assets |
| LCTAT | Current liabilities/book value of total assets |

* Denotes the output variable.
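As a small worked example of the variable construction, the PRICE variable (the log of the stock price truncated at $15) can be sketched in Python as follows. The function name `log_price` and the sample prices are hypothetical, and the natural logarithm is an assumption of this sketch; it is consistent with the maximum PRICE value of 2.7081 reported in Table 2, which equals ln(15) to four decimals.

```python
import math

PRICE_CAP = 15.0  # truncation level in dollars, as stated in the text


def log_price(price):
    """Log of the stock price after truncating it at PRICE_CAP.

    Assumes the natural logarithm; the function name is illustrative only.
    """
    return math.log(min(price, PRICE_CAP))


# Any price above the cap maps to the same value, log(15).
capped = round(log_price(40.0), 4)

# Prices below the cap are simply log-transformed.
uncapped = log_price(10.0)
```

Because every price above $15 collapses to the same value, the upper end of the PRICE distribution is a point mass, which matches the median and the maximum of PRICE coinciding in Table 2.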

Table 2 Descriptive statistics.

| Variable | Mean | Median | Min | Max | Standard deviation |
|---|---|---|---|---|---|
| DEFAULT | 0.0070 | 0 | 0 | 1 | 0.0836 |
| SIGMA | 0.5554 | 0.4221 | 0.1044 | 2.2289 | 0.4110 |
| EXRETURN | 0.0002 | 0.0035 | −0.4798 | 0.4375 | 0.1346 |
| NIMTA | −0.0141 | 0.0132 | −0.6972 | 0.2086 | 0.1296 |
| LTMTA | 0.4875 | 0.4547 | 0.0085 | 0.9821 | 0.2916 |
| CASHMTA | 0.1069 | 0.0512 | 0.0002 | 0.9040 | 0.1557 |
| RSIZE | −10.3520 | −10.3677 | −14.8038 | −5.4137 | 2.1200 |
| MB | 1.9579 | 1.5214 | −1.9387 | 9.6744 | 1.8528 |
| PRICE | 2.1024 | 2.7081 | −1.5198 | 2.7081 | 0.9129 |
| EBITAT | 0.0091 | 0.0468 | −0.8774 | 0.3210 | 0.1972 |
| REAT | −0.3388 | 0.0510 | −5.7605 | 0.7414 | 1.3385 |
| WCAPAT | 0.0795 | 0.0887 | −0.3510 | 0.8035 | 0.3142 |
| SALEAT | 0.8116 | 0.6229 | 0.0073 | 4.4172 | 0.7847 |
| MVEF | 0.1177 | 0.0053 | 0.0001 | 2.8542 | 0.4568 |
| LCTMTA | 0.1461 | 0.0965 | 0.0107 | 0.7644 | 0.1585 |
| LCTAT | 0.1928 | 0.1607 | 0.0345 | 0.8158 | 0.1585 |

3.2. Variable selection using LASSO

For the choice of the predictor variables used in QHM, we adopt LASSO to select financial variables for corporate bankruptcy forecasts. We apply the LASSO variable selection method to the full dataset and report the result in Fig. 1. The horizontal axis


illustrates the constraint on the LASSO process, while the vertical axis reports the estimated parameters associated with the constraints. When the constraints are extremely tight, virtually all the coefficients become zero. When the restrictive constraints are relaxed, the coefficient estimates start to become non-zero. The most significant variable enters the model first, followed by the less significant variables. Fig. 1 exhibits variables with the highest predictive performances chosen by the LASSO method, including PRICE, SIGMA, LTMTA, NIMTA, LCTAT, CASHMTA, RSIZE, and EXRETURN (the names of the variables are listed by their sequence of entering into the process). Our results are quite similar to the chosen variables in Campbell et al. (2008) and Tian et al. (2015). In fact, our selected variables match seven out of eight variables used in Campbell et al. (2008) and match six out of seven variables used in Tian et al. (2015).

Fig. 1. Coefficient path using the LASSO technique.

3.3. Industrial dummy variables

In this study we also consider the industrial dummy variables in order to provide some insights into the differences in default probabilities that may exist between firms in different sectors. To construct those variables, we use the Standard Industrial Classification (SIC) code to identify a firm’s industry. We adopt the ten industry sector categories used in Chava and Jarrow (2004)’s model. Table 3 summarizes the SIC code and industry sector information.

Table 3 SIC codes and corresponding industrial sectors.

| SIC Code | Industrial sectors |
|---|---|
| 0100-0999 | Agriculture, Forestry, and Fishing |
| 1000-1499 | Mining |
| 1500-1799 | Construction |
| 2000-3999 | Manufacturing |
| 4000-4999 | Transportation, Communications, Electric, Gas, and Sanitary service |
| 5000-5199 | Wholesale Trade |
| 5200-5999 | Retail Trade |
| 6000-6799 | Finance, Insurance, and Real Estate |
| 7000-8999 | Services |
| 9100-9999 | Public Administration |

We further combine the ten industrial sectors into four groups: group 1 includes finance, insurance, and real estate industries;


group 2 includes transportation, communications, and utility industries; group 3 includes manufacturing and mineral industries; group 4 includes miscellaneous industries. We create three dummy variables for the first three groups under the names D1, D2, and D3, respectively. The detailed construction of the industry variable is as follows:

$$D_1 = \begin{cases} 1 & \text{if } 6000 \le \text{SIC code} \le 6799, \\ 0 & \text{otherwise,} \end{cases}$$

$$D_2 = \begin{cases} 1 & \text{if } 4000 \le \text{SIC code} \le 4999, \\ 0 & \text{otherwise,} \end{cases}$$

$$D_3 = \begin{cases} 1 & \text{if } 1000 \le \text{SIC code} \le 1499 \text{ or } 2000 \le \text{SIC code} \le 3999, \\ 0 & \text{otherwise,} \end{cases}$$

with miscellaneous industries including companies having other SIC codes. After including the industry dummy variables, our DHM model in Eq. (1) and the QHM model in Eq. (4) have a total of eleven predictor variables x = (x1,…,x8,D1,D2,D3)′ and eleven corresponding coefficients β = (β1,…,β11)′, which are associated with SIGMA, EXRETURN, NIMTA, LTMTA, CASHMTA, RSIZE, PRICE, LCTAT, D1, D2, and D3, respectively.

4. Empirical results and discussions

We now use the selected variables to investigate their predictive ability toward default risk. One can make statistical inferences in a discrete hazard model by using a logit program (Shumway, 2001), which can also be applied in estimating QHM. For our model estimation, we use R software with package lqr.²

We first find the most suitable value of ε for the transformation in Eq. (2) for our dataset. As described in Section 2.3 above, we estimate the average of the absolute difference between estimated $\hat{q}_\tau(y_{i,t})$ in Eq. (3) using every two consecutive values of ε, with ε being in the chosen set {0.0001, 0.0005, 0.001, 0.005, 0.01, 0.05, 0.1, 0.5}. Specifically, we calculate $N^{-1}\sum_{i=1}^{n} |\hat{q}_\tau(y_{i,t}|\varepsilon_2) - \hat{q}_\tau(y_{i,t}|\varepsilon_1)|$ with ε2 > ε1, where N is the total number of firm years. The results in Table 4 show that those average differences are quite small in the range ε ∈ [0.001, 0.01] and take the smallest value at ε = 0.001. It indicates that the estimated values of $\hat{q}_\tau(y_{i,t}|\varepsilon)$ remain nearly the same when ε values move around ε = 0.001. Therefore, for the rest of our research, we use ε = 0.001 for our estimated models and forecasting.

Table 4 The average of the absolute difference between estimated $\hat{q}_\tau(y_{i,t}|\varepsilon)$, using every two consecutive values of ε (values are multiplied by 100).

| Epsilon (ε) | Quantile 0.05 | Quantile 0.25 | Quantile 0.5 | Quantile 0.75 | Quantile 0.95 | Average |
|---|---|---|---|---|---|---|
| 0.0001 | NA | NA | NA | NA | NA | NA |
| 0.0005 | 0.0509 | 0.0515 | 0.0569 | 0.1956 | 2.3626 | 0.5435 |
| 0.001 | 0.0513 | 0.0519 | 0.0576 | 0.1665 | 1.2642 | 0.3183 |
| 0.005 | 0.3968 | 0.4003 | 0.4346 | 0.9145 | 3.7381 | 1.1769 |
| 0.01 | 0.4868 | 0.4903 | 0.5226 | 0.8801 | 2.0566 | 0.8873 |
| 0.05 | 3.5665 | 3.5826 | 3.7280 | 4.9504 | 6.1675 | 4.3990 |
| 0.1 | 3.7891 | 3.7985 | 3.8799 | 4.3846 | 3.3404 | 3.8385 |
| 0.5 | 16.6671 | 16.6712 | 16.6982 | 16.4115 | 8.7362 | 15.0369 |

Note: The values are NAs when ε = 0.0001, because it is the first value in the chosen set and there is no previous value for comparison.

Table 5A presents the coefficient estimates of DHM and QHM over different quantile levels. Note that the left-hand side of QHM used here is a transformation function under different quantile levels rather than a mean.
Thus, βτj represents the change in the transformation of the outcome variable in the τ quantile level associated with a unit change of the jth input variable, holding all other predictors constant.

The first two rows of Table 5A present the estimated coefficients of the predictor variables under the DHM model. Specifically, the first row summarizes the results with the eight LASSO-selected predictor variables, while the second row presents the estimation results using the eleven predictor variables including three industrial dummy variables. Both DHM model specifications show that firms with higher “stock return in excess to market return”, “net income over total assets”, and “stock price” are less likely to fail. Such a result is consistent with our expectation for the financial market. Positive net income generally manifests good business conditions and predicts a firm’s strong prospect in the future. Thus, it may attract the attention of investors and further increase the firm’s excess return and stock price. Moreover, “stock return volatilities” and “total liabilities over total assets” enter the prediction model with positive signs, which is not surprising since a more volatile company with higher liabilities usually indicates a high-risk condition. On the other hand, the insignificant coefficient estimates of CASHMTA, RSIZE, and LCTAT in DHM indicate that those three predictor variables may not be able to predict the survival of a company.

² Consistent with DHM as discussed in Shumway (2001), in our QHM estimation we also adjust the sample size to account for the lack of independence between firm-year observations for test statistics calculation when the estimation results are directly available.
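The ε-selection step reported in Table 4 can be re-derived from the table entries themselves. The Python sketch below copies the five quantile columns of Table 4 (values multiplied by 100), recomputes each row average, and picks the ε with the smallest average. The variable names are hypothetical, and the row for ε = 0.0001 is left out because its entries are NA.

```python
# Rows of Table 4: eps -> entries for quantiles 0.05, 0.25, 0.5, 0.75, 0.95.
TABLE4 = {
    0.0005: [0.0509, 0.0515, 0.0569, 0.1956, 2.3626],
    0.001: [0.0513, 0.0519, 0.0576, 0.1665, 1.2642],
    0.005: [0.3968, 0.4003, 0.4346, 0.9145, 3.7381],
    0.01: [0.4868, 0.4903, 0.5226, 0.8801, 2.0566],
    0.05: [3.5665, 3.5826, 3.7280, 4.9504, 6.1675],
    0.1: [3.7891, 3.7985, 3.8799, 4.3846, 3.3404],
    0.5: [16.6671, 16.6712, 16.6982, 16.4115, 8.7362],
}

# "Average" column as printed in Table 4, kept for comparison.
REPORTED_AVERAGE = {
    0.0005: 0.5435, 0.001: 0.3183, 0.005: 1.1769, 0.01: 0.8873,
    0.05: 4.3990, 0.1: 3.8385, 0.5: 15.0369,
}

# Recompute the row averages across the five quantile levels.
averages = {eps: sum(row) / len(row) for eps, row in TABLE4.items()}

# The selection rule: keep the eps whose average difference is smallest.
best_eps = min(averages, key=averages.get)

# Largest gap between recomputed and printed averages (rounding only).
max_gap = max(abs(averages[e] - REPORTED_AVERAGE[e]) for e in TABLE4)
```

The recomputed averages agree with the printed "Average" column up to rounding, and the rule selects ε = 0.001, the value used for the rest of the paper.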

Table 5A Estimated coefficients’ results for discrete hazard model and quantile hazard model (2000–2014).

| Quantiles | Intercept | SIGMA | EXRETURN | NIMTA | LTMTA | CASHMTA | RSIZE |
|---|---|---|---|---|---|---|---|
| DHM1 | −5.7592** (0.0000) | 0.4645** (0.0000) | −0.5593* (0.0276) | −1.0487** (0.0001) | 1.8678** (0.0000) | −0.3666 (0.1700) | −0.0136 (0.6732) |
| DHM2 | −5.8923** (0.0000) | 0.4306** (0.0001) | −0.5370* (0.0338) | −1.1405** (0.0000) | 2.2131** (0.0000) | −0.4999 (0.0609) | −0.0328 (0.3189) |
| 0.05 | −6.9083** (0.0000) | 0.0003 (0.9865) | −0.0006 (0.9882) | −0.0009 (0.9874) | 0.0010 (0.9775) | −0.0005 (0.9926) | 0.0000 (0.9977) |
| 0.1 | −6.9066** (0.0000) | 0.0015 (0.9402) | −0.0027 (0.9475) | −0.0041 (0.9443) | 0.0046 (0.9015) | −0.0022 (0.9673) | 0.0001 (0.9899) |
| 0.15 | −6.9032** (0.0000) | 0.0038 (0.8509) | −0.0066 (0.8692) | −0.0102 (0.8617) | 0.0112 (0.7607) | −0.0054 (0.9189) | 0.0002 (0.9744) |
| 0.2 | −6.8973** (0.0000) | 0.0076 (0.7086) | −0.0131 (0.7444) | −0.0202 (0.7310) | 0.0217 (0.5554) | −0.0106 (0.8412) | 0.0005 (0.9483) |
| 0.25 | −6.8877** (0.0000) | 0.0134 (0.5116) | −0.0229 (0.5691) | −0.0352 (0.5501) | 0.0368 (0.3170) | −0.0183 (0.7291) | 0.0009 (0.9079) |
| 0.3 | −6.8728** (0.0000) | 0.0218 (0.2855) | −0.0371 (0.3573) | −0.0566 (0.3361) | 0.0575 (0.1177) | −0.0293 (0.5804) | 0.0014 (0.8486) |
| 0.35 | −6.8503** (0.0000) | 0.0338 (0.0982) | −0.0571 (0.1571) | −0.0867 (0.1411) | 0.0851* (0.0205) | −0.0442 (0.4031) | 0.0022 (0.7651) |
| 0.4 | −6.8170** (0.0000) | 0.0507* (0.0132) | −0.0849* (0.0355) | −0.1285* (0.0294) | 0.1214** (0.0009) | −0.0644 (0.2234) | 0.0033 (0.6530) |
| 0.45 | −6.7684** (0.0000) | 0.0745** (0.0003) | −0.1233** (0.0023) | −0.1863** (0.0016) | 0.1693** (0.0000) | −0.0912 (0.0845) | 0.0049 (0.5115) |
| 0.5 | −6.6979** (0.0000) | 0.1080** (0.0000) | −0.1766** (0.0000) | −0.2665** (0.0000) | 0.2325** (0.0000) | −0.1265* (0.0166) | 0.0070 (0.3494) |
| 0.55 | −6.5958** (0.0000) | 0.1556** (0.0000) | −0.2508** (0.0000) | −0.3786** (0.0000) | 0.3172** (0.0000) | −0.1731** (0.0010) | 0.0097 (0.1912) |
| 0.6 | −6.4471** (0.0000) | 0.2243** (0.0000) | −0.3550** (0.0000) | −0.5371** (0.0000) | 0.4328** (0.0000) | −0.2343** (0.0000) | 0.0133 (0.0730) |
| 0.65 | −6.2291** (0.0000) | 0.3250** (0.0000) | −0.5029** (0.0000) | −0.7642** (0.0000) | 0.5945** (0.0000) | −0.3146** (0.0000) | 0.0178* (0.0158) |
| 0.7 | −5.9074** (0.0000) | 0.4750** (0.0000) | −0.7152** (0.0000) | −1.0939** (0.0000) | 0.8273** (0.0000) | −0.4201** (0.0000) | 0.0232** (0.0016) |
| 0.75 | −5.4321** (0.0000) | 0.7011** (0.0000) | −1.0233** (0.0000) | −1.5774** (0.0000) | 1.1749** (0.0000) | −0.5585** (0.0000) | 0.0285** (0.0001) |
| 0.8 | −4.7560** (0.0000) | 1.0408** (0.0000) | −1.4673** (0.0000) | −2.2817** (0.0000) | 1.7174** (0.0000) | −0.7308** (0.0000) | 0.0304** (0.0000) |
| 0.85 | −3.9518** | 1.5179** | −2.0744** | −3.2690** | 2.6103** | −0.9120** | 0.0151** |

−1.1270**

−0.7872** (0.0000)

−0.5284** (0.0000)

−0.3548** (0.0000)

−0.2406** (0.0000)

−0.1646** (0.0000)

−0.1133** (0.0000)

−0.0781** (0.0000)

−0.0536** (0.0001)

−0.0364** (0.0065)

−0.0242 (0.0698)

−0.0156 (0.2417)

−0.0096 (0.4712)

−0.0055 (0.6807)

−0.0028 (0.8356)

−0.0011 (0.9339)

−0.0002 (0.9851)

−0.4100** (0.0000)

−0.4689** (0.0000)

PRICE

0.3233**

0.2976** (0.0000)

0.2441** (0.0000)

0.1962** (0.0003)

0.1565** (0.0038)

0.1239* (0.0220)

0.0973 (0.0720)

0.0756 (0.1620)

0.0578 (0.2848)

0.0433 (0.4235)

0.0314 (0.5615)

0.0218 (0.6866)

0.0143 (0.7919)

0.0085 (0.8744)

0.0045 (0.9342)

0.0018 (0.9730)

0.0004 (0.9938)

0.7461* (0.0118)

0.0638 (0.8056)

LCTAT

−0.6162**

−0.4212** (0.0000)

−0.3136** (0.0000)

−0.2400** (0.0000)

−0.1857** (0.0000)

−0.1439** (0.0000)

−0.1109** (0.0005)

−0.0847** (0.0077)

−0.0637* (0.0454)

−0.0467 (0.1419)

−0.0333 (0.2961)

−0.0227 (0.4763)

−0.0146 (0.6471)

−0.0086 (0.7870)

−0.0044 (0.8891)

−0.0018 (0.9549)

−0.0004 (0.9897)

−0.8411** (0.0000)

D1

−0.1102** (0.0000)

−0.0720** (0.0008)

−0.0504* (0.0191)

−0.0363 (0.0914)

−0.0265 (0.2186)

−0.0193 (0.3688)

−0.0141 (0.5132)

−0.0101 (0.6384)

−0.0071 (0.7403)

−0.0049 (0.8199)

−0.0032 (0.8804)

−0.0020 (0.9250)

−0.0012 (0.9566)

−0.0006 (0.9779)

−0.0002 (0.9911)

−0.0001 (0.9980)

−0.0618 (0.5878)

D3

0.3129** −0.2026** (continued on next page)

0.1760** (0.0000)

0.0956** (0.0013)

0.0517 (0.0816)

0.0278 (0.3489)

0.0145 (0.6240)

0.0072 (0.8080)

0.0032 (0.9134)

0.0012 (0.9683)

0.0002 (0.9940)

−0.0002 (0.9958)

−0.0002 (0.9937)

−0.0002 (0.9947)

−0.0001 (0.9966)

−0.0001 (0.9982)

0.0000 (0.9993)

0.0000 (0.9998)

0.0641 (0.6854)

D2


2.4141** (0.0000)

−5.1012** (0.0000)

0.95

**

−4.9406** (0.0000)

−4.3965 (0.0000)

−2.7135 (0.0000) −2.7527** (0.0000)

(0.0000) **

NIMTA

(0.0000)

EXRETURN

5.5216** (0.0000)

4.1105 (0.0000)

**

(0.0000)

LTMTA

Notes: The p-values are in parentheses. * and ** denote significance at the 5% and 1% levels, respectively. DHM1 denotes the discrete hazard model without industrial dummy variables; DHM2 denotes the discrete hazard model with industrial dummy variables.

2.0258 (0.0000)

−3.6726 (0.0000)

0.9

(0.0000)

(0.0000)

**

SIGMA

Intercept

Estimated Coefficient

**

Quantiles

Table 5A (continued)

1.0824** (0.0000)

−0.7831 (0.0000)

(0.0000) **

CASHMTA

**

−0.3805** (0.0000)

−0.0749 (0.0000)

(0.0255)

RSIZE

**

−0.8208** (0.0000)

−1.3507 (0.0000)

(0.0000)

PRICE

0.1464** (0.0003)

0.2126 (0.0000)

**

(0.0000)

LCTAT

**

−2.6841** (0.0000)

−1.1992 (0.0000)

(0.0000)

D1

−0.1345** (0.0000)

0.3750 (0.0000)

**

(0.0000)

D2

−1.0048** (0.0000)

−0.5282** (0.0000)

(0.0000)

D3


The estimates for the dummy variables show evidence of differences between industry sectors in terms of default risk. A negative coefficient on a dummy variable indicates that the corresponding industry’s default probability is, on average, lower than that of the base industry (Chava & Jarrow, 2004). Examining the signs and magnitudes of the three industrial dummy variables from the DHM(2) model fitting results, we observe a significantly lower default risk on average for the finance, insurance, and real estate industry group (industry 1) compared to the baseline miscellaneous industry (industry 4). However, there are no significant differences in the unconditional default probabilities between the transportation, communications, and utility group (industry 2) or the manufacturing and mineral industries group (industry 3) and the baseline industry. Our unreported partial F-test result confirms the appropriateness of including industrial dummy variables in our model.

The remaining entries in Table 5A summarize the results from fitting our proposed QHM approach, with each row corresponding to a quantile location. In estimating QHM, we include the three dummy variables to examine how the industrial effects change across quantile levels. We find that the significance and the magnitude of those three dummy variables change across quantile levels. Specifically, the three industry dummy variables become statistically significant at the 75th quantile level and higher. At the median level, we observe a significant dummy variable D1 and insignificant dummy variables D2 and D3, which is similar to the DHM results. This shows that the proposed QHM can uncover a fuller picture of the relationship between industry effects and default risk at different quantile levels.
Furthermore, we find that the default risk is highest for the transportation, communications, and utility industry group, followed by the miscellaneous industry group, the manufacturing and mineral industry group, and the finance, insurance, and real estate industry group at most quantile levels below the 95th quantile. However, at the 95th quantile level this order changes to the miscellaneous industry group, followed by the transportation, communications, and utility industry group, the manufacturing and mineral industry group, and the finance, insurance, and real estate industry group. These results provide supporting evidence that the QHM specification is able to capture this special “tail property”. In summary, the difference in default probability between industrial sectors across a variety of quantile levels helps shed light on the issue of portfolio diversification in risk management for global investors.

The five predictor variables that are significant in DHM show increasingly strong predictive ability for a default event at higher quantile levels in QHM. In particular, the loadings on stock volatility (SIGMA), stock excess return (EXRETURN), the profitability ratio of net income over total assets (NIMTA), the liability ratio of total liabilities over total assets (LTMTA), and the log of stock price (PRICE) all increase in magnitude with the expected signs. This suggests that those factors become more and more important in the group of high-risk companies. More specifically, when a company is already in difficulty, risk signals such as high stock volatility and a high liabilities-to-assets ratio push the company to the brink of bankruptcy more quickly. In contrast, for those same companies, safety signals such as high excess return, stock price, and net income over total assets help to rescue them from a recession and improve their business performance faster than for low-risk companies.
The three variables CASHMTA, RSIZE, and LCTAT, which are insignificant in DHM, exhibit significant effects at high quantile levels when using QHM. LCTAT enters the list of factors affecting default risk with positive coefficients from the 60th percentile, illustrating that the current liabilities ratio is also an important indicator when considering the risk of failure. Current liabilities, which are debts and obligations due within one year, usually put considerable pressure on companies in general and even more pressure on poorly performing companies in particular. For companies with high default risk especially, a large amount of debt that has to be paid within a short time period brings even more trouble and drives them into bankruptcy more quickly.

Two other variables, CASHMTA and RSIZE, not only show significant effects on failure risk, but also switch signs when moving along high quantile levels. More specifically, CASHMTA has a negative sign from the 50th to the 90th percentile, but suddenly changes to a positive sign at the 95th percentile. This means that a high cash and short-term investment ratio could be a good sign for ordinary companies, but a threat to companies with very high default risk. The former view is reasonable, since this variable measures the proportion of cash and short-term investments in marketable securities, which can quickly and easily be converted into cash, over total assets. A higher CASHMTA ratio means the company has more liquid assets and indicates a strong financial condition. However, it does not tell the whole story. A high cash and short-term investment ratio may indicate inefficiency in the utilization of cash or that a firm is not maximizing the potential benefit of low-cost loans. Moreover, unused cash and low benefits from short-term investment may not create enough returns to cover debts and liabilities.
Therefore, an excessively high CASHMTA could also be a sign of crisis in high-risk companies. In contrast, RSIZE has a positive sign from the 65th to the 85th percentile, but changes to a negative sign from the 90th percentile. This means that for companies under high risk, the higher their market capitalization or total value in the stock market is, the greater the chance is that they can recover from a crisis. The finding is reasonable, because companies with a larger market size usually bring more confidence to the market during a recession than do small ones; investors typically will not flee from those larger companies during bad economic times. Since traders typically discount the market equity of firms that are close to default, market capitalization is also crucial to predicting default rates.

From the results above, we see that many important indicators ignored by DHM come to light when using QHM, which demonstrates the worth of examining different quantile levels when investigating default risk. Note that at the low quantile levels none of the coefficient estimates is statistically significant in predicting a default event. This result is not surprising, because the distribution of the default outcome is heavily skewed toward operating companies (which take the value of 0). A failure event is rather rare. At the middle quantile levels, the QHM results are very similar to those of DHM. For most predictor variables, a higher level of quantiles implies higher absolute


Table 5B Goodness-of-fit results for discrete hazard model and quantile hazard model (2000–2014).

Quantiles   Pseudo-R2   AUC
DHM1        0.1125      0.8220
DHM2        0.1184      0.8259
0.05        0.2493      0.8331
0.1         0.2413      0.8186
0.15        0.2334      0.8336
0.2         0.2257      0.8341
0.25        0.2181      0.8347
0.3         0.2106      0.8353
0.35        0.2032      0.8360
0.4         0.1960      0.8366
0.45        0.1889      0.8372
0.5         0.1821      0.8378
0.55        0.1755      0.8382
0.6         0.1694      0.8386
0.65        0.1638      0.8389
0.7         0.1593      0.8391
0.75        0.1561      0.8393
0.8         0.1553      0.8394
0.85        0.1574      0.8398
0.9         0.1595      0.8406
0.95        0.1457      0.8392

Notes: DHM1 denotes the discrete hazard model without industrial dummy variables; DHM2 denotes the discrete hazard model with industrial dummy variables.
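As a rough illustration of the Pseudo-R2 column in Table 5B, the following sketch computes McFadden’s pseudo-R2 as one minus the ratio of the fitted model’s log-likelihood to that of an intercept-only model. The function names and the toy outcomes and probabilities are hypothetical and are not taken from the paper’s data; the sketch also assumes that the intercept-only model predicts the sample default rate for every observation.

```python
import math

def bernoulli_log_likelihood(y, p, eps=1e-12):
    """Log-likelihood of binary outcomes y under predicted probabilities p."""
    total = 0.0
    for yi, pi in zip(y, p):
        pi = min(max(pi, eps), 1.0 - eps)  # guard against log(0)
        total += yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
    return total

def mcfadden_pseudo_r2(y, p_fitted):
    """One minus the ratio of fitted to intercept-only log-likelihood."""
    base_rate = sum(y) / len(y)  # assumption: intercept-only model predicts the sample mean
    ll_fitted = bernoulli_log_likelihood(y, p_fitted)
    ll_null = bernoulli_log_likelihood(y, [base_rate] * len(y))
    return 1.0 - ll_fitted / ll_null

# Hypothetical toy data: 1 = default, 0 = operating firm-year.
y_toy = [0, 0, 0, 1, 0, 1, 0, 0]
p_toy = [0.05, 0.10, 0.20, 0.70, 0.10, 0.60, 0.30, 0.05]
r2_toy = mcfadden_pseudo_r2(y_toy, p_toy)
print(round(r2_toy, 4))
```

A model that only reproduces the sample default rate scores zero under this measure, and values closer to one indicate a larger improvement over the intercept-only fit.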

values of parameters, because the proportion of default companies rises as the quantile level increases, strengthening the influence of the predictor variables.

For the model comparison, we use two popular measures: (1) the area under the receiver operating characteristic curve (AUC) and (2) McFadden’s pseudo-R2. A widely used method for evaluating binary models is the receiver operating characteristic curve, which plots the true positive rate (TPR) against the false positive rate (FPR) as the classification threshold varies. Larger AUC values indicate on average a better classification performance (Bradley, 1997; Hosmer, Lemeshow, & Sturdivant, 2013). Here, AUC takes values from 0.5 to 1, and the model with the highest value is the most desirable one. McFadden’s pseudo-R2 (McFadden, 1974) is a log-likelihood based information measure. It is equal to one minus the ratio of the log-likelihood of the fitted model to that of the intercept-only model. A model with a higher McFadden’s pseudo-R2 value is preferable. The goodness-of-fit statistics presented in the last two columns of Table 5B show that the QHM model with high quantile

Table 6 P-values for the test of the equality between QHM coefficients and DHM coefficients (2000–2014).

Variables   Quantile levels
            0.05    0.1     0.15    0.2     0.25    0.3     0.35    0.4     0.45    0.5
SIGMA       0.0001  0.0001  0.0001  0.0001  0.0002  0.0002  0.0004  0.0006  0.0014  0.0037
EXRETURN    0.0363  0.0371  0.0385  0.0410  0.0449  0.0511  0.0611  0.0777  0.1065  0.1597
NIMTA       0.0000  0.0000  0.0000  0.0000  0.0001  0.0001  0.0001  0.0002  0.0005  0.0014
LTMTA       0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000
CASHMTA     0.0662  0.0671  0.0689  0.0719  0.0766  0.0835  0.0937  0.1092  0.1328  0.1698
RSIZE       0.3306  0.3296  0.3276  0.3241  0.3183  0.3106  0.2993  0.2842  0.2640  0.2388
PRICE       0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000
LCTAT       0.0133  0.0135  0.0138  0.0143  0.0151  0.0162  0.0176  0.0196  0.0223  0.0260
D1          0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000
D2          0.6906  0.6906  0.6899  0.6899  0.6899  0.6892  0.6899  0.6914  0.6958  0.7054
D3          0.5947  0.5954  0.5975  0.6010  0.6066  0.6136  0.6241  0.6376  0.6556  0.6811

Variables   Quantile levels
            0.55    0.6     0.65    0.7     0.75    0.8     0.85    0.9     0.95
SIGMA       0.0134  0.0637  0.3426  0.6899  0.0151  0.0000  0.0000  0.0000  0.0000
EXRETURN    0.2640  0.4777  0.8942  0.4871  0.0580  0.0003  0.0000  0.0000  0.0000
NIMTA       0.0053  0.0274  0.1691  0.8650  0.1109  0.0000  0.0000  0.0000  0.0000
LTMTA       0.0000  0.0000  0.0000  0.0000  0.0000  0.0166  0.0549  0.0000  0.0000
CASHMTA     0.2294  0.3286  0.4952  0.7695  0.8290  0.3953  0.1288  0.2951  0.0000
RSIZE       0.2077  0.1719  0.1336  0.0971  0.0689  0.0605  0.1539  0.2084  0.0000
PRICE       0.0000  0.0004  0.0136  0.4214  0.0849  0.0000  0.0000  0.0000  0.0000
LCTAT       0.0312  0.0388  0.0503  0.0679  0.0955  0.1365  0.1603  0.0759  0.0450
D1          0.0000  0.0001  0.0002  0.0006  0.0026  0.0165  0.1988  0.0404  0.0000
D2          0.7241  0.7581  0.8220  0.9386  0.8446  0.4871  0.1221  0.0530  0.2150
D3          0.7144  0.7611  0.8267  0.9219  0.9299  0.6767  0.2240  0.0001  0.0000
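The p-values in Table 6 come from a z-test for the equality of two regression coefficients. A minimal sketch of such a test is given below, assuming a two-sided test against the standard normal distribution; the function name and the input numbers are hypothetical, since the standard errors behind Table 6 are not reported here.

```python
import math

def coefficient_equality_test(b_qhm, se_qhm, b_dhm, se_dhm):
    """z-statistic and two-sided p-value for H0: the two coefficients are equal."""
    z = (b_qhm - b_dhm) / math.sqrt(se_qhm ** 2 + se_dhm ** 2)
    # Two-sided p-value under N(0, 1): 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value

# Hypothetical coefficient estimates and standard errors, for illustration only.
z_toy, p_toy_z = coefficient_equality_test(b_qhm=1.0, se_qhm=0.3, b_dhm=0.4, se_dhm=0.4)
print(round(z_toy, 3), round(p_toy_z, 4))
```

A small p-value indicates that the QHM coefficient at a given quantile level differs from the corresponding DHM coefficient by more than sampling error alone would suggest.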


Fig. 2. Estimated QHM coefficients for various quantile levels (2000–2014). Notes: The dots are coefficients of each variable associated to each quantile level estimated by QHM. The solid lines indicate the estimated coefficients of each variable estimated by DHM and the dashed lines indicate the 95% confidence interval of DHM’s coefficients.

levels has higher McFadden’s pseudo-R2 than does the DHM model. This indicates that the former has less information loss and thus provides a better fit for the bankruptcy data. Similarly, with a higher AUC value, the QHM model with high quantile levels shows better discriminatory ability than does the DHM model.

To clarify the differences between QHM’s estimated coefficients and the corresponding DHM’s estimated coefficients, we use the statistical test for the equality of regression coefficients (Clogg, Petkova, & Haritou, 1995; Paternoster, Brame, Mazerolle, & Piquero, 1998). Those studies suggest computing the z-statistic:

$$ z = \frac{\beta_i^{\tau} - \beta_i^{DHM}}{\sqrt{SE(\beta_i^{\tau})^2 + SE(\beta_i^{DHM})^2}}, $$

under the null hypothesis H0: βiτ = βiDHM, where βiτ is the coefficient of predictor variable xi at quantile level τ under the QHM


Table 7 AUC values of one-year-ahead out-of-sample predictive performance for the period 2010–2014.

Quantile   2010     2011     2012     2013     2014
DHM        0.8535   0.8334   0.9623   0.9405   0.8746
0.05       0.8434   0.8479   0.9587   0.9496   0.8765
0.1        0.8365   0.8267   0.9457   0.9335   0.8678
0.15       0.8431   0.8482   0.9583   0.9497   0.8772
0.2        0.8429   0.8486   0.9583   0.9498   0.8774
0.25       0.8425   0.8488   0.9578   0.9498   0.8781
0.3        0.8422   0.8495   0.9571   0.9495   0.8789
0.35       0.8417   0.8503   0.9566   0.9495   0.8795
0.4        0.8414   0.8507   0.9556   0.9495   0.8802
0.45       0.8414   0.8510   0.9547   0.9490   0.8806
0.5        0.8416   0.8509   0.9537   0.9480   0.8809
0.55       0.8419   0.8507   0.9527   0.9471   0.8814
0.6        0.8421   0.8506   0.9516   0.9462   0.8816
0.65       0.8425   0.8506   0.9504   0.9450   0.8821
0.7        0.8434   0.8498   0.9490   0.9436   0.8822
0.75       0.8446   0.8490   0.9479   0.9417   0.8826
0.8        0.8460   0.8481   0.9474   0.9403   0.8832
0.85       0.8477   0.8481   0.9488   0.9393   0.8842
0.9        0.8516   0.8478   0.9520   0.9397   0.8848
0.95       0.8537   0.8508   0.9630   0.9389   0.8813

Note: The highest AUC values of each forecasting year are marked in bold.

Table 8 Brier score values of one-year-ahead out-of-sample predictive performance for the period 2010–2014 (values are multiplied by 10).

Quantile   2010     2011     2012     2013     2014
DHM        0.0480   0.0525   0.0403   0.0298   0.0558
0.05       0.0483   0.0526   0.0403   0.0298   0.0559
0.1        0.0483   0.0526   0.0403   0.0298   0.0559
0.15       0.0483   0.0526   0.0403   0.0298   0.0559
0.2        0.0483   0.0526   0.0403   0.0298   0.0559
0.25       0.0483   0.0526   0.0403   0.0298   0.0559
0.3        0.0483   0.0526   0.0403   0.0298   0.0559
0.35       0.0483   0.0526   0.0403   0.0298   0.0559
0.4        0.0483   0.0526   0.0403   0.0298   0.0559
0.45       0.0483   0.0526   0.0403   0.0298   0.0559
0.5        0.0483   0.0526   0.0403   0.0298   0.0559
0.55       0.0483   0.0526   0.0403   0.0298   0.0559
0.6        0.0482   0.0525   0.0402   0.0298   0.0558
0.65       0.0482   0.0525   0.0401   0.0297   0.0558
0.7        0.0480   0.0523   0.0399   0.0296   0.0556
0.75       0.0477   0.0520   0.0393   0.0295   0.0555
0.8        0.0509   0.0615   0.0448   0.0319   0.0619
0.85       0.1433   0.2179   0.1552   0.0850   0.1557
0.9        0.6639   0.9867   0.6993   0.4571   0.5730
0.95       3.5557   4.1844   3.2824   2.6009   2.5958

Note: The lowest BS values of each forecasting year are marked in bold.

approach, and βiDHM is the coefficient of predictor variable xi under the DHM approach. SE(βiτ) and SE(βiDHM) are their standard errors. Table 6 reports the p-values of this test for each coefficient and quantile level, which suggest that the differences between the QHM coefficients and the DHM coefficients occur mostly at the high quantile levels. For example, in the 90th quantile level, all eight QHM coefficients are significantly different from the DHM coefficients, lending support to the use of QHM. In addition, Fig. 2 depicts QHM’s estimated parameters at various quantile levels using dots, together with the 95% confidence interval of the corresponding DHM estimated parameters. The QHM parameter estimates at different quantile levels rarely fall into the 95% confidence interval of the corresponding DHM estimated parameters, confirming the distinction between the coefficients estimated by QHM and DHM.

As a robustness check, we further estimate the default risk using DHM and QHM over two other time periods: 1980 to 1989 and 1990 to 1999. The results are qualitatively similar, as QHM consistently outperforms DHM over different sampling periods. The


Appendix reports the detailed results.

We now measure our models’ predictive ability by evaluating the out-of-sample forecasting performance of DHM and of QHM across different quantiles. We first estimate the parameters in the training period (2000–2009) and produce a one-step-ahead forecast for the testing period (year 2010). Next, we adopt the rolling window approach to produce forecasts for the remaining testing years from 2011 to 2014. We evaluate predictive accuracy using both (1) AUC and (2) the Brier score (BS). The AUC performs very well in practice and is often used when a general measure of prediction performance is preferred (Fawcett, 2006). Table 7 reports the AUC values for the one-step-ahead forecasting performance. DHM never presents the best predictive performance for default risk estimation. The highest AUC values of the testing periods appear when using QHM at high quantile levels. This result lends support for carefully examining different quantile levels when studying default risk.

Aside from AUC, the literature has widely used BS for measuring the probabilistic prediction accuracy of a binary outcome, calculated by the equation

$$ BS = N^{-1} \sum_{n=1}^{N} (\hat{y}_{i,t+1} - y_{i,t+1})^2, $$

where yi,t+1 is the observed default outcome of company i at time t + 1, ŷi,t+1 is the corresponding estimated default probability, N is the total number of firm years in the testing periods, and the sum runs over those N firm-year observations. The lowest BS value indicates the best predictive performance. Table 8 exhibits the BS values for the one-step-ahead forecasting performance. Once again, it provides evidence that using DHM alone is not good enough for predicting default risk, and that it is worth taking different quantile levels into consideration. More interestingly, according to the BS criterion, the best predictive performance in each forecasting year falls at the 0.75 quantile level.
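The two forecast-accuracy measures used above can be sketched as follows. This is only an illustration under stated assumptions: the function names and the toy outcomes and probabilities are hypothetical, and the AUC is computed through its pairwise-ranking interpretation (the share of default and non-default pairs in which the defaulting firm receives the higher predicted probability, with ties counted as one half) rather than by tracing the ROC curve.

```python
def brier_score(y, p):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)

def auc(y, p):
    """AUC as the probability that a random default outranks a random non-default."""
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    neg = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = 0.0
    for a in pos:
        for b in neg:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))

# Hypothetical one-year-ahead forecasts: 1 = default, 0 = operating.
y_next = [0, 0, 1, 0, 1]
p_next = [0.10, 0.40, 0.35, 0.20, 0.80]
bs_toy = brier_score(y_next, p_next)
auc_toy = auc(y_next, p_next)
print(round(bs_toy, 4), round(auc_toy, 4))
```

The pairwise loop is quadratic in the sample size, so for a large firm-year panel a rank-based or library implementation would be the more practical choice.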
The Brier score results suggest that the 75th quantile level may yield the best prediction performance when examining companies’ failure risk. Moreover, in all testing periods the BS values increase significantly beyond the optimal 75th quantile level, indicating that the predictive accuracy drops quickly afterwards. Therefore, choosing the right prediction quantile level is crucial for accurate predictive performance.

5. Conclusion

This research extends existing studies in the literature about financial ratios and firms’ default risk by using QHM and comparing the findings to DHM. The DHM results indicate the adverse effects of high stock return volatility and high liabilities on a company’s failure risk, while high stock return and a high profitability ratio help mitigate this hazard. By adopting the proposed QHM, we further show that at high quantile levels of the default outcome (companies with high failure risk), the liquidity ratio of cash and short-term investment to total assets, market capitalization, and the current liabilities-to-total assets ratio play important predictive roles, which DHM ignores. Moreover, the QHM results also illustrate the differences in default probability between industrial sectors at different quantile levels, thus providing useful information for investors looking to diversify their portfolios. Such findings show that the proposed QHM method presents a more comprehensive view by examining the degree of financial ratios’ effect over varying quantile levels, thus offering more accurate results. The improved out-of-sample forecasting results also support the advantage of using our proposed QHM.

Acknowledgements

The authors thank the editors and referee for their precious time and valuable comments to improve the quality of this paper. Cathy W.S. Chen’s research is funded by the Ministry of Science and Technology, Taiwan (MOST 105-2118-M-035-003-MY2).

Appendix


SIGMA

0.2735* (0.0140) 0.2766* (0.0131) 0.0002 (0.9945) 0.0011 (0.9755) 0.0027 (0.9385) 0.0054 (0.8778) 0.0095 (0.7871) 0.0154 (0.6605) 0.0238 (0.4971) 0.0356 (0.3109) 0.0520 (0.1401) 0.0746* (0.0348) 0.1059** (0.0029) 0.1493** (0.0000) 0.2097** (0.0000) 0.2933** (0.0000) 0.4067** (0.0000) 0.5503** (0.0000) 0.6885** (0.0000) 0.7402** (0.0000) 0.5438** (0.0000)

INTERCEPT

−6.8161** (0.0000) −6.7674** (0.0000) −6.9089** (0.0000) −6.9094** (0.0000) −6.9103** (0.0000) −6.9115** (0.0000) −6.9130** (0.0000) −6.9144** (0.0000) −6.9152** (0.0000) −6.9146** (0.0000) −6.9113** (0.0000) −6.9034** (0.0000) −6.8876** (0.0000) −6.8589** (0.0000) −6.8091** (0.0000) −6.7278** (0.0000) −6.6062** (0.0000) −6.4724** (0.0000) −6.5597** (0.0000) −7.3300** (0.0000) −4.5815** (0.0000)

Estimated Coefficient

0.0347 (0.9021) 0.0253 (0.9285) 0.0000 (0.9995) −0.0002 (0.9976) −0.0005 (0.9939) −0.0011 (0.9877) −0.0019 (0.9780) −0.0032 (0.9634) −0.0051 (0.9419) −0.0079 (0.9108) −0.0119 (0.8662) −0.0177 (0.8032) −0.0259 (0.7156) −0.0377 (0.5971) −0.0549 (0.4454) −0.0790 (0.2766) −0.1103 (0.1346) −0.1379 (0.0674) −0.1187 (0.1267) −0.0732 (0.3570) −0.1765* (0.0250)

EXRETURN −0.9887** (0.0014) −1.0028** (0.0012) −0.0014 (0.9876) −0.0063 (0.9451) −0.0159 (0.8624) −0.0317 (0.7294) −0.0560 (0.5417) −0.0918 (0.3183) −0.1431 (0.1205) −0.2158* (0.0196) −0.3181** (0.0006) −0.4617** (0.0000) −0.6634** (0.0000) −0.9463** (0.0000) −1.3412** (0.0000) −1.8833** (0.0000) −2.5937** (0.0000) −3.4057** (0.0000) −3.9835** (0.0000) −3.8745** (0.0000) −2.4787** (0.0000)

NIMTA 1.3384** (0.0000) 1.2970** (0.0000) 0.0008 (0.9883) 0.0035 (0.9482) 0.0087 (0.8713) 0.0172 (0.7490) 0.0299 (0.5783) 0.0480 (0.3726) 0.0729 (0.1751) 0.1069* (0.0468) 0.1527** (0.0045) 0.2146** (0.0001) 0.2988** (0.0000) 0.4145** (0.0000) 0.5762** (0.0000) 0.8063** (0.0000) 1.1402** (0.0000) 1.6278** (0.0000) 2.2475** (0.0000) 2.3906** (0.0000) 1.1147** (0.0000)

LTMTA

Notes: The p-values are in parentheses. * and ** denote significance at the 5% and 1% levels, respectively. DHM1 denotes the discrete hazard model without industrial dummy variables; DHM2 denotes the discrete hazard model with industrial dummy variables.

0.95

0.9

0.85

0.8

0.75

0.7

0.65

0.6

0.55

0.5

0.45

0.4

0.35

0.3

0.25

0.2

0.15

0.1

0.05

DHM2

DHM1

Quantiles

Table A1 Estimated coefficients’ results of discrete hazard model and quantile hazard model (1980–1989).

−1.3556** (0.0044) −1.4224** (0.0030) −0.0008 (0.9942) −0.0037 (0.9745) −0.0093 (0.9365) −0.0182 (0.8757) −0.0314 (0.7874) −0.0499 (0.6683) −0.0748 (0.5202) −0.1076 (0.3537) −0.1504 (0.1939) −0.2060 (0.0744) −0.2781* (0.0155) −0.3728** (0.0011) −0.4989** (0.0000) −0.6702** (0.0000) −0.9094** (0.0000) −1.2735** (0.0000) −1.9084** (0.0000) −3.0854** (0.0000) −2.5353** (0.0000)

CASHMTA −0.1379** (0.0007) −0.1377** (0.0008) 0.0000 (0.9986) −0.0001 (0.9939) −0.0002 (0.9850) −0.0005 (0.9705) −0.0008 (0.9492) −0.0013 (0.9192) −0.0019 (0.8777) −0.0028 (0.8209) −0.0041 (0.7428) −0.0060 (0.6352) −0.0087 (0.4897) −0.0128 (0.3062) −0.0195 (0.1191) −0.0309* (0.0130) −0.0521** (0.0000) −0.0961** (0.0000) −0.2069** (0.0000) −0.4873** (0.0000) −0.6413** (0.0000)

RSIZE −0.2805** (0.0000) −0.2802** (0.0000) −0.0002 (0.9906) −0.0009 (0.9583) −0.0022 (0.8954) −0.0045 (0.7929) −0.0079 (0.6433) −0.0129 (0.4487) −0.0200 (0.2380) −0.0302 (0.0752) −0.0446** (0.0086) −0.0651** (0.0001) −0.0945** (0.0000) −0.1373** (0.0000) −0.2005** (0.0000) −0.2951** (0.0000) −0.4374** (0.0000) −0.6430** (0.0000) −0.8811** (0.0000) −0.9066** (0.0000) −0.4251** (0.0000)

PRICE 1.1564** (0.0000) 1.1895** (0.0000) 0.0009 (0.9888) 0.0042 (0.9501) 0.0105 (0.8752) 0.0210 (0.7543) 0.0369 (0.5818) 0.0603 (0.3690) 0.0937 (0.1634) 0.1407* (0.0365) 0.2067** (0.0022) 0.2996** (0.0000) 0.4309** (0.0000) 0.6186** (0.0000) 0.8900** (0.0000) 1.2865** (0.0000) 1.8673** (0.0000) 2.6996** (0.0000) 3.7501** (0.0000) 4.3371** (0.0000) 3.0755** (0.0000)

LCTAT

0.1109 (0.5157) 0.0001 (0.9991) 0.0003 (0.9959) 0.0006 (0.9897) 0.0012 (0.9796) 0.0022 (0.9641) 0.0036 (0.9414) 0.0056 (0.9086) 0.0085 (0.8613) 0.0127 (0.7927) 0.0190 (0.6934) 0.0286 (0.5522) 0.0436 (0.3636) 0.0674 (0.1566) 0.1060* (0.0245) 0.1696** (0.0003) 0.2797** (0.0000) 0.5137** (0.0000) 0.9982** (0.0000) 1.1472** (0.0000)

D1

−0.1404 (0.5115) 0.0000 (0.9998) −0.0001 (0.9993) −0.0001 (0.9982) −0.0003 (0.9963) −0.0005 (0.9931) −0.0009 (0.9882) −0.0015 (0.9807) −0.0023 (0.9693) −0.0036 (0.9524) −0.0055 (0.9271) −0.0085 (0.8885) −0.0131 (0.8281) −0.0208 (0.7300) −0.0346 (0.5656) −0.0609 (0.3087) −0.1172* (0.0468) −0.2725** (0.0000) −0.5738** (0.0000) −0.6220** (0.0000)

D2

−0.0692 (0.5244) 0.0000 (1.0000) 0.0000 (0.9999) 0.0000 (0.9995) 0.0000 (0.9987) 0.0001 (0.9970) 0.0002 (0.9936) 0.0005 (0.9876) 0.0009 (0.9776) 0.0015 (0.9622) 0.0023 (0.9408) 0.0033 (0.9146) 0.0042 (0.8900) 0.0043 (0.8891) 0.0008 (0.9784) −0.0127 (0.6794) −0.0549 (0.0714) −0.1918** (0.0000) −0.4988** (0.0000) −0.5590** (0.0000)

D3


SIGMA

0.1758* (0.0488) 0.1506 (0.0923) 0.0002 (0.9940) 0.0009 (0.9732) 0.0022 (0.9325) 0.0044 (0.8651) 0.0078 (0.7632) 0.0128 (0.6206) 0.0200 (0.4396) 0.0302 (0.2444) 0.0444 (0.0877) 0.0641* (0.0140) 0.0915** (0.0005) 0.1298** (0.0000) 0.1835** (0.0000) 0.2594** (0.0000) 0.3673** (0.0000) 0.5190** (0.0000) 0.7269** (0.0000) 1.0721** (0.0000) 1.2645** (0.0000)

INTERCEPT

−5.5239** (0.0000) −5.2560** (0.0000) −6.9077** (0.0000) −6.9039** (0.0000) −6.8963** (0.0000) −6.8833** (0.0000) −6.8630** (0.0000) −6.8323** (0.0000) −6.7875** (0.0000) −6.7233** (0.0000) −6.6321** (0.0000) −6.5030** (0.0000) −6.3204** (0.0000) −6.0611** (0.0000) −5.6917** (0.0000) −5.1673** (0.0000) −4.4389** (0.0000) −3.5111** (0.0000) −2.6896** (0.0000) −3.3320** (0.0000) −3.3379** (0.0000)

Estimated Coefficient

0.1324 (0.5253) 0.1614 (0.4392) 0.0001 (0.9990) 0.0003 (0.9957) 0.0007 (0.9897) 0.0013 (0.9810) 0.0021 (0.9696) 0.0031 (0.9556) 0.0042 (0.9398) 0.0054 (0.9226) 0.0066 (0.9054) 0.0077 (0.8892) 0.0087 (0.8765) 0.0088 (0.8745) 0.0067 (0.9051) −0.0006 (0.9922) −0.0201 (0.7238) −0.0671 (0.2417) −0.1689** (0.0030) −0.4778** (0.0000) −0.6693** (0.0000)

EXRETURN −1.2029** (0.0000) −1.2741** (0.0000) −0.0019 (0.9790) −0.0083 (0.9072) −0.0208 (0.7713) −0.0410 (0.5661) −0.0715 (0.3180) −0.1153 (0.1079) −0.1767* (0.0139) −0.2619** (0.0003) −0.3794** (0.0000) −0.5410** (0.0000) −0.7635** (0.0000) −1.0696** (0.0000) −1.4877** (0.0000) −2.0494** (0.0000) −2.7760** (0.0000) −3.6311** (0.0000) −4.4109** (0.0000) −5.0244** (0.0000) −4.1048** (0.0000)

NIMTA 1.6471** (0.0000) 1.7432** (0.0000) 0.0014 (0.9753) 0.0063 (0.8919) 0.0155 (0.7386) 0.0300 (0.5196) 0.0507 (0.2760) 0.0789 (0.0895) 0.1165* (0.0121) 0.1659** (0.0004) 0.2307** (0.0000) 0.3159** (0.0000) 0.4291** (0.0000) 0.5816** (0.0000) 0.7899** (0.0000) 1.0787** (0.0000) 1.4851** (0.0000) 2.0580** (0.0000) 2.8483** (0.0000) 3.6725** (0.0000) 2.6818** (0.0000)

LTMTA

Notes: The p-values are in parentheses. * and ** denote significance at the 5% and 1% levels, respectively. DHM1 denotes the discrete hazard model without industrial dummy variables; DHM2 denotes the discrete hazard model with industrial dummy variables.

0.95

0.9

0.85

0.8

0.75

0.7

0.65

0.6

0.55

0.5

0.45

0.4

0.35

0.3

0.25

0.2

0.15

0.1

0.05

DHM2

DHM1

Quantiles

Table A2 Estimated coefficients’ results of discrete hazard model and quantile hazard model (1990–1999).

−0.8757** (0.0077) −0.8893** (0.0068) −0.0010 (0.9922) −0.0042 (0.9655) −0.0104 (0.9151) −0.0202 (0.8362) −0.0343 (0.7254) −0.0535 (0.5830) −0.0790 (0.4171) −0.1122 (0.2490) −0.1546 (0.1116) −0.2087* (0.0315) −0.2773** (0.0042) −0.3645** (0.0002) −0.4752** (0.0000) −0.6170** (0.0000) −0.8036** (0.0000) −1.0563** (0.0000) −1.4451** (0.0000) −2.3284** (0.0000) −2.5932** (0.0000)

CASHMTA −0.0473 (0.1571) −0.0462 (0.1713) 0.0000 (0.9967) 0.0002 (0.9852) 0.0006 (0.9626) 0.0011 (0.9255) 0.0019 (0.8696) 0.0031 (0.7904) 0.0048 (0.6843) 0.0070 (0.5509) 0.0100 (0.3971) 0.0138 (0.2412) 0.0187 (0.1126) 0.0247* (0.0354) 0.0319** (0.0066) 0.0394** (0.0008) 0.0447** (0.0001) 0.0383** (0.0009) −0.0122** (0.2745) −0.2171** (0.0000) −0.5307** (0.0000)

RSIZE −0.5368** (0.0000) −0.5276** (0.0000) −0.0005 (0.9766) −0.0022 (0.8968) −0.0054 (0.7467) −0.0106 (0.5243) −0.0184 (0.2686) −0.0296 (0.0750) −0.0454** (0.0064) −0.0674** (0.0001) −0.0979** (0.0000) −0.1406** (0.0000) −0.2007** (0.0000) −0.2861** (0.0000) −0.4089** (0.0000) −0.5861** (0.0000) −0.8391** (0.0000) −1.1788** (0.0000) −1.5454** (0.0000) −1.6392** (0.0000) −1.1339** (0.0000)

PRICE 0.8246** (0.0000) 0.6316** (0.0024) 0.0006 (0.9915) 0.0028 (0.9626) 0.0069 (0.9073) 0.0136 (0.8190) 0.0236 (0.6925) 0.0377 (0.5276) 0.0573 (0.3380) 0.0840 (0.1606) 0.1203* (0.0450) 0.1695** (0.0048) 0.2364** (0.0001) 0.3279** (0.0000) 0.4539** (0.0000) 0.6269** (0.0000) 0.8610** (0.0000) 1.1691** (0.0000) 1.5297** (0.0000) 1.7810** (0.0000) 1.0179** (0.0000)

LCTAT

−0.5622** (0.0002) −0.0004 (0.9929) −0.0017 (0.9686) −0.0041 (0.9223) −0.0080 (0.8489) −0.0137 (0.7443) −0.0216 (0.6073) −0.0322 (0.4441) −0.0459 (0.2740) −0.0637 (0.1289) −0.0865* (0.0390) −0.1157** (0.0057) −0.1533** (0.0002) −0.2025** (0.0000) −0.2689** (0.0000) −0.3649** (0.0000) −0.5218** (0.0000) −0.8270** (0.0000) −1.3164** (0.0000) −1.1340** (0.0000)

D1

−0.1160 (0.4577) −0.0001 (0.9980) −0.0005 (0.9910) −0.0013 (0.9773) −0.0026 (0.9542) −0.0046 (0.9188) −0.0076 (0.8672) −0.0117 (0.7956) −0.0174 (0.7002) −0.0251 (0.5792) −0.0353 (0.4349) −0.0488 (0.2794) −0.0667 (0.1382) −0.0906* (0.0432) −0.1226** (0.0059) −0.1652** (0.0002) −0.2246** (0.0000) −0.2927** (0.0000) −0.2696** (0.0000) −0.1574** (0.0000)

D2

−0.3595** (0.0001) −0.0003 (0.9919) −0.0012 (0.9647) −0.0030 (0.9137) −0.0057 (0.8350) −0.0096 (0.7260) −0.0148 (0.5877) −0.0217 (0.4274) −0.0306 (0.2633) −0.0421 (0.1242) −0.0569* (0.0375) −0.0765** (0.0052) −0.1027** (0.0002) −0.1390** (0.0000) −0.1919** (0.0000) −0.2757** (0.0000) −0.4286** (0.0000) −0.7366** (0.0000) −1.1474** (0.0000) −1.2400** (0.0000)

D3
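The significance markers described in the notes to Table A2 can be reproduced from the reported p-values. The following is a minimal Python sketch, not the authors' code: the function name `significance_stars` is hypothetical, and the strict inequalities at the 1% and 5% thresholds are an assumption, since the notes do not say how boundary values are treated.

```python
def significance_stars(p_value: float) -> str:
    """Return the marker from the table notes: '**' for 1%, '*' for 5%.

    Assumption: thresholds are strict (p < 0.01, p < 0.05).
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p-value must lie in [0, 1]")
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""


# (coefficient, p-value) pairs copied from Table A2.
examples = [(-0.1153, 0.1079), (-0.1767, 0.0139), (-0.2619, 0.0003)]

for coef, p in examples:
    # Print each entry in the same "coefficient + stars (p-value)" layout as the table.
    print(f"{coef:.4f}{significance_stars(p)} ({p:.4f})")
```

Under these thresholds the three pairs receive no marker, `*`, and `**`, matching the table. For the p-value 0.2745 the function returns an empty string, although the table prints the entry as −0.0122** (0.2745), so that marker may be a typesetting slip in the source.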


