European Journal of Operational Research 201 (2010) 838–846
Decision Support
Support vector machines for default prediction of SMEs based on technology credit

Hong Sik Kim, So Young Sohn *

Department of Information and Industrial Engineering, Yonsei University, 134 Shinchon-dong, Seoul 120-749, Republic of Korea
Article info

Article history: Received 14 August 2008; Accepted 25 March 2009; Available online 1 April 2009

Keywords: Support vector machines; Default prediction model; Small and medium enterprises
Abstract

In Korea, many forms of credit guarantees have been issued to fund small and medium enterprises (SMEs) with a high degree of growth potential in technology. However, a high default rate among funded SMEs has been reported. In order to effectively manage such governmental funds, it is important to develop an accurate scoring model for selecting promising SMEs. This paper provides a support vector machines (SVM) model to predict the default of funded SMEs, considering various input variables such as financial ratios, economic indicators, and technology evaluation factors. The results show that the accuracy performance of the SVM model is better than that of back-propagation neural networks (BPNs) and logistic regression. It is expected that the proposed model can be applied to a wide range of technology evaluation and loan or investment decisions for technology-based SMEs. © 2009 Elsevier B.V. All rights reserved.
1. Introduction

As globalization and competition among countries throughout the world intensify, technology has become an important factor contributing to the competitiveness and development of a country’s industry. Because of these changes to the corporate environment, many small and medium enterprises (SMEs) require an immense amount of funds for technology development and commercialization. However, most SMEs are in financially difficult positions in terms of cash flow. To resolve the funding problem, various types of technology funds at the governmental level have been made available to boost SMEs’ economic activities based on their technology scorecard. However, due to an inaccurate scorecard, the default rate of selected SMEs turns out to be higher than the default rate of companies that received loans based simply on their financial statements. Therefore, it is essential to develop an accurate technology scoring model for SMEs for the efficient management of governmental funds.

Various methods have been used to predict the default of enterprises. Beaver (1966) originally proposed the use of univariate analysis on financial ratios to predict default. Altman (1968), Altman et al. (1977), and Pompe and Bilderbe (2005) used Multiple Discriminant Analysis (MDA) to develop default prediction formulas. However, MDA requires a homogeneous variance assumption for both bankrupt and non-bankrupt firms. Later research examining bankruptcy (Ohlson, 1980; Aziz et al., 1988; Aziz and Lawson, 1989) favors logistic regression (logit) over MDA for both theoretical and empirical reasons. The logit model requires less restrictive statistical assumptions and offers better empirical discrimination (Zavgren, 1983). However, the strict assumptions of traditional statistical models and pre-existing functional forms relating response variables to predictor variables limit their application in the real world.
* Corresponding author: S.Y. Sohn. Tel.: +82 2 2123 4014; fax: +82 2 364 7807. doi:10.1016/j.ejor.2009.03.036

Since the 1980s, artificial intelligence (AI) techniques, particularly rule-based expert systems, case-based reasoning systems (Bryant, 1997; Buta, 1994), and machine learning techniques such as artificial neural networks (ANNs) have been successfully applied to default prediction (Desai et al., 1997; Elmer and Borowski, 1988; Jensen, 1992; Malhotra and Malhotra, 2002; Markham and Ragsdale, 1995; Patuwo et al., 1993; Srinivasan and Ruparel, 1990; West, 2000; Zhang, 2000; Zhang et al., 1999). In particular, ANNs are powerful tools for pattern recognition and pattern classification due to their non-linear, non-parametric adaptive-learning properties, and many studies have compared ANN with other classification techniques. These studies showed that the accuracy of ANN is better than that of other techniques.

However, ANN has some shortcomings. First, ANN depends on the researchers’ experience or knowledge to preprocess data in order to select control parameters. Second, it is difficult to generalize the results due to overfitting. Third, it is difficult for ANN to explain the prediction results due to its lack of explanatory power.

In order to overcome these shortcomings, this paper proposes a support vector machine (SVM) to build a default prediction model for a technology credit guarantee fund that supports technology-based SMEs. Developed by Vapnik (1998), SVM is gaining popularity due to many attractive features and excellent generalization performance on a wide range of problems. In addition, SVM embodies the structural
risk minimization principle, which has been shown to be superior to the traditional empirical risk minimization principle employed by conventional neural networks.

To create a default prediction model using SVM, this study considered various input variables including not only financial ratios and the general characteristics of SMEs, but also technology evaluation factors. Also considered were economic indicators such as the consumer price index and the exchange rate, since the default rate of SMEs is sensitive to changes in environmental conditions because of a lack of financial resources. In order to evaluate the prediction accuracy of SVM, this study also compared its performance with that of logistic regression analysis (Logit) and back-propagation neural networks (BPNs).

The remainder of this paper is organized as follows. Section 2 introduces the support vector method for the prediction of defaults in funded small and medium enterprises. Section 3 presents the research data for building the default prediction model. Section 4 provides an overview of the empirical data analysis. Finally, Section 5 presents the summary of this study as well as future research issues.

2. Support vector machines

Support vector machines (SVMs) are classification techniques based on statistical learning theory (Vapnik, 1995, 1998). SVM produces a binary classifier, the so-called optimal separating hyperplane, through an extremely non-linear mapping of the input vectors into a high-dimensional feature space. SVM constructs a linear model to estimate the decision function using non-linear class boundaries based on support vectors. If the data are linearly separable, SVM trains linear machines for an optimal hyperplane that separates the data without error and with the maximum distance between the hyperplane and the closest training points. The training points that are closest to the optimal separating hyperplane are called support vectors.
There are some advantages to using SVMs (Shin et al., 2005): (1) there are only two free parameters to be chosen, namely, the upper bound and the kernel parameter; (2) the solution of SVM is unique, optimal, and global, since the training of an SVM is done by solving a linearly constrained quadratic problem; (3) SVMs are based on the structural risk minimization principle, which means that this type of classifier minimizes the upper bound of the actual risk, whereas other classifiers minimize the empirical risk.

Owing to these advantages, a number of studies have addressed the theory and applications of SVM since Vapnik introduced it from statistical learning theory. Applications include financial time-series forecasting (Mukherjee et al., 1997; Tay and Cao, 2001), marketing (Ben-David and Lindenbaum, 1997), estimating manufacturing yields (Stoneking, 1999), text categorization (Joachims, 2002), face detection using images (Osuna et al., 1997), handwritten digit recognition (Burges and Schokopf, 1997; Cortes and Vapnik, 1995), and medical diagnosis (Tarassenko et al., 1995).

Next, it is necessary to briefly describe the basic SVM concepts for typical two-class classification problems. Let us define labeled training examples $[x_i, y_i]$, consisting of an input vector $x_i \in \mathbb{R}^n$ and a class value $y_i \in \{-1, 1\}$, $i = 1, \ldots, I$. For the linearly separable case, the decision rules defined by an optimal hyperplane separating the binary decision classes are given in the following equation in terms of the support vectors:
$$Y = \operatorname{sign}\left(\sum_{i=1}^{N} y_i \alpha_i (x \cdot x_i) + b\right), \tag{1}$$

where $Y$ is the outcome, $y_i$ is the class value of the training example $x_i$, and $\cdot$ represents the inner product. The vector $x$ corresponds to an input and the vectors $x_i$, $i = 1, \ldots, N$, are the support vectors. In Eq. (1), $b$ and $\alpha_i$ are parameters that determine the hyperplane. For the non-linearly separable case, a high-dimensional version of Eq. (1) is given as follows:

$$Y = \operatorname{sign}\left(\sum_{i=1}^{N} y_i \alpha_i K(x, x_i) + b\right). \tag{2}$$
The function K is defined as the kernel function for generating the inner products to construct machines with different types of nonlinear decision surfaces in the input space. There are several kernel functions:
Radial basis function (RBF):

$$K(x, x_i) = \exp\{-\gamma (x - x_i)^2\}; \tag{3}$$

Polynomial kernel of degree $d$:

$$K(x, x_i) = (\gamma\, x \cdot x_i + r)^d; \tag{4}$$

Two-layer neural network:

$$K(x, x_i) = \tanh(\gamma\, x \cdot x_i + r), \tag{5}$$

where $\gamma > 0$ ($\gamma \in \mathbb{R}$) and $r, d \in \mathbb{N}$. Based on this method, this study treats the default problem as a non-linear problem and uses the RBF kernel to optimize the hyperplane.
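To make Eqs. (2)–(5) concrete, the following minimal sketch evaluates the three kernels and the kernelized decision rule in plain Python. The support vectors, labels, multipliers, and bias are made-up toy values rather than anything estimated in this study, and mapping a score of exactly zero to +1 is an assumption of the sketch.

```python
import math

def dot(u, v):
    """Inner product x . x_i."""
    return sum(a * b for a, b in zip(u, v))

def rbf_kernel(x, xi, gamma):
    """Eq. (3): exp(-gamma * ||x - x_i||^2), with gamma > 0."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
    return math.exp(-gamma * sq_dist)

def polynomial_kernel(x, xi, gamma, r, d):
    """Eq. (4): (gamma * x . x_i + r)^d."""
    return (gamma * dot(x, xi) + r) ** d

def two_layer_nn_kernel(x, xi, gamma, r):
    """Eq. (5): tanh(gamma * x . x_i + r)."""
    return math.tanh(gamma * dot(x, xi) + r)

def decide(x, support_vectors, labels, alphas, b, kernel):
    """Eq. (2): sign(sum_i y_i * alpha_i * K(x, x_i) + b).

    A score of exactly 0 is mapped to +1 (an assumption of this sketch).
    """
    score = sum(
        y_i * a_i * kernel(x, sv)
        for sv, y_i, a_i in zip(support_vectors, labels, alphas)
    ) + b
    return 1 if score >= 0 else -1

# Toy values (hypothetical): one support vector per class.
support_vectors = [(0.0, 0.0), (2.0, 2.0)]
labels = [-1, 1]
alphas = [1.0, 1.0]
bias = 0.0

def rbf(x, xi):
    return rbf_kernel(x, xi, gamma=0.5)

near_positive = decide((1.8, 1.9), support_vectors, labels, alphas, bias, rbf)
near_negative = decide((0.1, 0.2), support_vectors, labels, alphas, bias, rbf)
print(near_positive, near_negative)
```

With the RBF kernel, each support vector contributes most to inputs close to it, so the point near (2, 2) is assigned to class +1 and the point near the origin to class −1.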
3. Empirical case study

3.1. Input variables description

In this section, the proposed approach is applied to technology credit fund recipients, namely the SMEs supported on the basis of their technology scores evaluated during 1997–2002. The data contain not only financial ratios and general characteristics of the company, such as the sales per employee, the history of the company, and whether or not the firm is listed on the stock market, but also technology evaluation scores. Specific descriptions of the obtained data are shown in Tables 1–4.

Table 1 shows not only the general enterprise attributes but also SME characteristics such as whether or not the firm is a venture company (C3) certified by the SMBA (Small and Medium Business Administration of Korea). These characteristics are considered in order to build the default prediction model. Some SME characteristics include the following. First, this study considers ‘listed on stock market or not’ as a variable to build the default prediction model, since the financial status of SMEs listed on stock markets such as the Korean Stock Price Index (KOSPI), Korea Securities Dealers Automated Quotation (KOSDAQ) or other stock markets is more stable than that of SMEs that are not listed
Table 1
Description of SME characteristics.

| Attribute | Description |
| --- | --- |
| Listed on the stock market (C1) | KOSPI, KOSDAQ, or other exchange = 1, or not = 0 |
| Age of company (C2) | |
| Venture company (C3) | Certified by SMBA^a = 1, or not = 0 |
| External audit (C4) | External audit = 1, or not = 0 |
| Foreign investment (C5) | Investment by foreigners = 1, or not = 0 |
| Expert manager (C6) | Expert manager = 1, or not = 0 |
| Patent (C7) | Patent = 1, or not = 0 |
| Joint company (C8) | Consortium = 1, or not = 0 |

^a SMBA (Small and Medium Business Administration).
Table 2
Description of financial ratios.

| Attribute | Description |
| --- | --- |
| Net income to total assets (F1) | $\dfrac{\text{Net income}}{(\text{Beginning total assets} + \text{Ending total assets})/2} \times 100$ |
| Net income to stockholder’s equity (F2) | $\dfrac{\text{Net income}}{(\text{Beginning stockholder's equity} + \text{Ending stockholder's equity})/2} \times 100$ |
| Net income to sales (F3) | $\dfrac{\text{Net income}}{(\text{Beginning sales} + \text{Ending sales})/2} \times 100$ |
| Total asset turnover (F4) | $\dfrac{\text{Sales}}{(\text{Beginning assets} + \text{Ending assets})/2} \times 100$ |
| Stockholder’s equity turnover (F5) | $\dfrac{\text{Sales}}{(\text{Beginning stockholder's equity} + \text{Ending stockholder's equity})/2} \times 100$ |
| Total assets growth rate (F6) | $\dfrac{\text{Ending total assets}}{\text{Beginning total assets}} \times 100 - 100$ |
| Stockholders’ equity growth rate (F7) | $\dfrac{\text{Ending stockholder's equity}}{\text{Beginning stockholder's equity}} \times 100 - 100$ |
| Sales growth rate (F8) | $\dfrac{\text{Ending sales}}{\text{Beginning sales}} \times 100 - 100$ |
| Debt ratio (F9) | $\dfrac{\text{Ending sales}}{\text{Beginning sales}} \times 100 - 100$ |
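The definitions in Table 2 follow two arithmetic patterns: a flow figure divided by the average of the beginning and ending balance (F1–F5), and an ending figure relative to the beginning figure minus 100 (F6–F8). A minimal sketch of both patterns follows; the function names and the input figures are hypothetical and chosen only for illustration.

```python
def average_balance_ratio(flow, beginning, ending):
    """Pattern of F1-F5: flow / ((beginning + ending) / 2) * 100."""
    return flow / ((beginning + ending) / 2) * 100

def growth_rate(beginning, ending):
    """Pattern of F6-F8: ending / beginning * 100 - 100."""
    return ending / beginning * 100 - 100

# Hypothetical firm: net income 12, total assets growing from 180 to 220.
net_income_to_total_assets = average_balance_ratio(12.0, 180.0, 220.0)
total_assets_growth = growth_rate(180.0, 220.0)
print(round(net_income_to_total_assets, 2), round(total_assets_growth, 2))
```

Averaging the beginning and ending balances keeps a ratio from being distorted when the balance changes sharply within the period.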
Table 3
Technology evaluation factors and 16 attributes used for scorecard.

| Factor | Abbreviation | Variable | Scale | Attribute |
| --- | --- | --- | --- | --- |
| Management | KMA | T1 | 5 | Technology knowledge |
| Management | TEPS | T2 | 5 | Technology experience |
| Management | MAS | T3 | 5 | Management ability |
| Management | FSS | T4 | 5 | Fund supply |
| Management | HRS | T5 | 5 | Human resources |
| Technology | ETDS | T6 | 5 | Environment of technology development |
| Technology | OTDS | T7 | 5 | Output of technology development (e.g. patents) |
| Technology | NTS | T8 | 5 | New technology development |
| Technology | TSS | T9 | 10 | Technology superiority |
| Technology | TCS | T10 | 10 | Technology commercialization potential |
| Marketability | MPS | T11 | 5 | Market potential |
| Marketability | MCS | T12 | 5 | Market characteristics |
| Marketability | PCS | T13 | 10 | Product competitiveness |
| Profitability | SPS | T14 | 5 | Sales schedule |
| Profitability | ASS | T15 | 5 | Business progress (new*) or amount of sales (old*) |
| Profitability | PFS | T16 | 5 | Return on investment (new*) or profitability (old*) |

* New: companies less than 3 years old; old: companies older than 3 years.
on a stock market. If an applicant SME is certified as a venture enterprise, it is easier for it to obtain support from a government-related financing service. Also, when an SME is audited by an external organization, this indicates that its financial status is healthier than that of SMEs that are not audited. Several variables are also considered for default prediction for SMEs: whether or not an SME is audited by an external
Table 4
Economic indicators.

| Attribute | Description |
| --- | --- |
| E1 | Total business environment index |
| E2 | Economic situations index of SMEs |
| E3 | Economic preceding index |
| E4 | Business survey index |
| E5 | KOSPI (Korean Composite Stock Price Index) |
| E6 | Operation index of SME |
| E7 | Consumer price index |
| E8 | An earning rate of the national bonds in 3 years |
| E9 | The exchange rate of won per dollar |
| E10 | Oil price |
organization, whether or not an SME has patents, whether an SME is part of a joint company, and whether or not an SME is managed by an expert manager. Here, an expert manager is neither the entrepreneur nor the founder but a manager with professional experience in the related field.

Table 2 shows the financial ratios considered in building the default prediction model. Generally, financial ratios represent many aspects of a business and are an integral part of financial statement analysis. Financial ratios are categorized according to the financial aspect of the business that the ratio measures. Liquidity ratios measure the availability of cash to pay debt. Activity ratios measure how quickly a firm converts non-cash assets to cash assets. Debt ratios measure the firm’s ability to repay long-term debt. Profitability ratios measure the firm’s use of its assets and control of its expenses to generate an acceptable rate of return. Also, Altman (1968) compiled a list of 22 financial ratios and classified each of them into one of five categories (liquidity, profitability, leverage, solvency, and activity). VanHorne (1989) evaluated financial performance by liquidity ratios, debt ratios, coverage ratios, and profitability ratios. In addition, Weston and Thomas (1985) considered financial performance in terms of leverage ratios, activity ratios, growth ratios, and valuation ratios. Based on this literature, nine variables were selected with respect to activity, profitability, liquidity, and growth ratios.

The variables displayed in Tables 1 and 2 have been used to build default prediction models for general companies. However, since the aim of this research is to establish a default prediction model for technology-based SMEs that were supported by technology credit guarantee funds from the government, it is necessary to add variables that reflect the technology characteristics of SMEs.
In addition, economic variables are considered to account for the influence of the economy on default. These additional variables are displayed in Tables 3 and 4.

A technology scorecard typically attempts to examine the firms that own technology with many technology-related attributes. These multi-attributes are used differently according to the purpose of evaluation and the technology evaluation institute. Nevertheless, multi-attributes generally consist of factors such as the manager’s integrity, level of technology, marketability of technology, technology profitability and external environmental factors. As shown in Table 3, 16 technology attributes were used to build the default prediction model for the technology credit recipients (Sohn et al., 2005, 2007; Sohn and Kim, 2007; Jeon and Sohn, 2008; Kim and Sohn, 2007; Moon and Sohn, 2008).

Table 4 shows the economic indicators used to construct the default prediction model for SMEs. The default of firms has a close relationship with economic situations and, in particular, small and medium enterprises are more sensitive to changes in economic conditions. Therefore, some economic indicators are considered as input variables. It is important to determine which economic indicators pertain to default by SMEs. Ten economic indicators, supplied officially by the Korea National Statistical Office (KNSO, 2007), are considered, as shown in Table 4. These indicators were selected based not only on discussions with experts who work in related fields (Korea Technology Credit Guarantee fund; Small and Business Corporation; Small and Medium Business Administration) but also on previous studies (Ceylan and Ozturk, 2004). First, three indicators, namely the total business environment index, the economic situations index of SMEs, and the operation index of SMEs, are suitable to reflect short-term economic conditions. Second, the business survey index and KOSPI are used to forecast microeconomic conditions.
Third, the earning rate of three-year national bonds and the exchange rate of the Korean won per dollar would have a close relationship with the technology-based credit loan for SMEs. Finally, the economic preceding index, consumer price index, and oil price are macro indicators, which are used to reflect the economic activity of SMEs.

3.2. Data preprocessing

This section introduces the data preprocessing used to build the default prediction model. After eliminating cases with missing values, 4590 cases remained. Among them, 907 (19.76%) were default cases and 3683 (80.24%) were not. Here, default is declared if a funded SME corresponds to one of the following situations: delayed payback, issuance of a bad check, failed product commercialization, bad credibility of the manager, closed business, or corporate reorganization procedure within three years after receiving technology funds. Otherwise, the case is considered non-default.

Generally, the financial ratios of SMEs may be contaminated by some degree of error since most SMEs with technology evaluation show weak financial stability. If these data are not cleaned or eliminated, the established model may be unstable. Thus, to build a more accurate default prediction model, the abnormal cases, which fell in the top 1% or the bottom 1% of each financial ratio, were eliminated. After eliminating these abnormal data, 3827 cases remained. Among them, 724 cases (18.92%) were default cases and 3103 cases (81.08%) were not. As seen in the data, the number of default cases is much smaller than that of non-default cases. Therefore, over-sampling was performed so that the analysis set contained 724 default cases and 724 non-default cases. The data set is arbitrarily split into two subsets; about 80% of the data is used for a training set and 20% for a validation set.
The training data for the SVM is entirely used to construct the model and the validation data is used to test the results with data that are not utilized to develop the model. In the case of a back-propagation neural network, the data are divided into three subsets: a training set of 60%, a validation set of 20%, and a test data set of 20%.
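The preprocessing steps of Section 3.2 (trimming the top and bottom 1% of each financial ratio, balancing the two classes, and splitting 80/20) can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually run in the study: the paper does not say how the 724 non-default cases were chosen, so random sampling is assumed here, and the column names and synthetic data are hypothetical.

```python
import random

def drop_extremes(rows, ratio_keys, tail=0.01):
    """Drop rows lying in the bottom or top `tail` share of any listed ratio."""
    n_cut = int(len(rows) * tail)  # cases removed from each tail, per ratio
    keep = set(range(len(rows)))
    for key in ratio_keys:
        order = sorted(range(len(rows)), key=lambda i: rows[i][key])
        if n_cut:
            keep -= set(order[:n_cut]) | set(order[-n_cut:])
    return [rows[i] for i in sorted(keep)]

def balance(rows, label_key, rng):
    """Equal numbers of default (1) and non-default (0) cases.

    Random sampling of the larger class is an assumption of this sketch.
    """
    pos = [r for r in rows if r[label_key] == 1]
    neg = [r for r in rows if r[label_key] == 0]
    n = min(len(pos), len(neg))
    return rng.sample(pos, n) + rng.sample(neg, n)

def split(rows, train_share, rng):
    """Shuffle, then cut into a training part and a validation part."""
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_share)
    return shuffled[:cut], shuffled[cut:]

# Synthetic data (hypothetical): one ratio "F1", about 20% defaults.
rng = random.Random(42)
rows = [{"F1": rng.gauss(0.0, 1.0), "default": int(i % 5 == 0)}
        for i in range(1000)]

cleaned = drop_extremes(rows, ["F1"])
balanced = balance(cleaned, "default", rng)
train, valid = split(balanced, 0.8, rng)
print(len(cleaned), len(balanced), len(train), len(valid))
```

Applied to the 1448 balanced cases of this study, the same 80% cut gives 1158 training and 290 validation cases, which matches the counts reported in Section 4.2.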
Additionally, in order to validate the classifier, a K-fold cross-validation procedure (Weiss and Kulikowski, 1991) was applied. In K-fold cross-validation, the original sample is partitioned into K subsamples. Of the K subsamples, a single subsample is retained as the validation data for testing the model, while the remaining $K - 1$ subsamples are used as training data. The cross-validation process is then repeated K times (the folds), with each of the K subsamples used exactly once as the validation data. The K results from the folds can then be averaged (or otherwise combined) to produce a single estimate. Based on this procedure, the data were divided into five subsamples ($K = 5$), and the average of the K validation results is taken as the accuracy of the default prediction model.

3.3. Construction of SVM model

One of the most important factors in building the default prediction model using SVM is the selection of the kernel function. Generally, there are many types of kernel functions, such as the radial basis function (RBF), polynomial, and two-layer neural network kernels. In this paper, the RBF kernel is used as the default kernel function, primarily for four reasons (Hsu et al., 2004): (1) this type of kernel makes it possible to map the non-linear boundaries of the input space into a higher dimensional feature space; (2) in terms of performance, Keerthi and Lin (2003) showed that the linear kernel with a parameter $C$ has the same performance as the RBF kernel with parameters $(C, \gamma)$; (3) the polynomial kernel has more hyperparameters than the RBF kernel; and (4) the RBF kernel has fewer numerical difficulties because the kernel values lie between zero and one, while the polynomial kernel values may go to infinity or zero when the degree is large.
In view of these advantages, the RBF kernel is used to build the default prediction model for the technology credit guarantee fund for SMEs. Once the RBF kernel is selected, it is necessary to decide the two parameters associated with it: $C$ and $\gamma$. The upper bound $C$ and the kernel parameter $\gamma$ play crucial roles in the performance of SVMs, and improper selection of these parameters can be counterproductive. Nevertheless, there is little general guidance for determining the parameter values of SVM. Hsu et al. (2004) suggested a practical guideline for SVM using grid-search and cross-validation, and this guideline is followed in this study. This study performed a grid-search on $C$ and $\gamma$ using fivefold cross-validation. Basically, all pairs of $(C, \gamma)$ are tested, and the one with the best cross-validation accuracy is selected. Utilizing exponentially increasing sequences of $C$ and $\gamma$ is a practical method for identifying optimal parameters (for example, $C = 10^{-4}, 10^{-3}, 10^{-2}, \ldots, 10^{2}, 10^{3}, 10^{4}$ and $\gamma = 10^{-4}, 10^{-3}, 10^{-2}, \ldots, 10^{2}, 10^{3}, 10^{4}$). The MATLAB SVM toolbox is utilized for this analysis.

3.4. Construction of logistic regression

Logistic regression (Logit) analysis has also been used to investigate the relationship between binary or ordinal response probability and explanatory variables. The method fits a linear logistic regression model for binary or ordinal response data by the method of maximum likelihood. Ohlson (1980) was one of the first users of Logit analysis in the context of financial distress. This technique weighs the independent variables and assigns a probability of default to each company in a sample. The advantage of this method is that it does not assume multivariate normality and equal covariance matrices. Logit analysis incorporates non-linear effects, and uses the logistic cumulative function in predicting a default:

$$\text{Probability of default} = \frac{1}{1 + e^{-z}} = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n)}}. \tag{6}$$
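Eq. (6) can be evaluated directly once coefficients are available. The sketch below uses made-up coefficients and inputs for illustration only; they are not the estimates reported in this paper.

```python
import math

def default_probability(x, beta0, betas):
    """Eq. (6): 1 / (1 + exp(-z)), with z = beta0 + sum_j beta_j * x_j."""
    z = beta0 + sum(b_j * x_j for b_j, x_j in zip(betas, x))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients and inputs.
p_neutral = default_probability([0.0, 0.0], beta0=0.0, betas=[1.0, 2.0])
p_risky = default_probability([1.0, 0.5], beta0=0.0, betas=[1.0, 2.0])
print(p_neutral, round(p_risky, 4))
```

A linear score of zero maps to a probability of 0.5, and larger scores move the probability toward 1, which is how the model turns a weighted sum of inputs into a default probability.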
The stepwise method is used to select significant variables. The stepwise selection process terminates if no further variable can be added to the model at a given significance level, or if the variable just entered into the model is the only variable removed in the subsequent elimination. Table 5 shows the result of the logistic regression using the stepwise selection process. As a result of stepwise selection, 14 out of 43 input variables turn out to be significant for default at the 10% level. Here, the parameters are estimated for the default probability. In the case of T1, when an SME supported by the technology credit guarantee fund has a low technology knowledge score (T1), the default rate of the technology credit guarantee fund decreases. These 14 input variables are also used for the back-propagation neural network and the support vector machine in order to compare the accuracy of the proposed model.
Table 5
Maximum likelihood estimates of logistic model for default prediction.

| Parameter | DF | Estimate | Standard error | Chi-square | Pr > Chisq. |
| --- | --- | --- | --- | --- | --- |
| Intercept | 1 | 12.7835 | 6.4323 | 3.9498 | 0.0469 |
| T1 | 1 | 0.2715 | 0.054 | 25.2902 | <0.0001 |
| T4 | 1 | 0.5873 | 0.0811 | 52.4862 | <0.0001 |
| T7 | 1 | 0.1792 | 0.0697 | 6.6143 | 0.0101 |
| T10 | 1 | 0.2324 | 0.0436 | 28.4326 | <0.0001 |
| T11 | 1 | 0.2003 | 0.0958 | 4.3727 | 0.0365 |
| T12 | 1 | 0.1965 | 0.0532 | 13.6221 | 0.0002 |
| T15 | 1 | 0.0837 | 0.0462 | 3.2829 | 0.07 |
| ECO2 | 1 | 0.2797 | 0.0796 | 12.3435 | 0.0004 |
| ECO3 | 1 | 0.1077 | 0.0206 | 27.3435 | <0.0001 |
| F2 | 1 | 0.00482 | 0.00215 | 5.0256 | 0.025 |
| F4 | 1 | 0.1559 | 0.0695 | 5.0275 | 0.0249 |
| F6 | 1 | 0.00291 | 0.000512 | 32.3845 | <0.0001 |
| F9 | 1 | 0.000836 | 0.000307 | 7.4329 | 0.0064 |
| C1 | 1 | 0.3466 | 0.118 | 8.6194 | 0.0033 |
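As a plausibility check on Table 5, the chi-square column can be compared with the squared ratio of each estimate to its standard error. This assumes the column is a Wald statistic, which the text does not state. Small gaps are expected because the printed estimates and errors are rounded.

```python
def wald_chi_square(estimate, std_error):
    """Wald statistic for a single coefficient: (estimate / std_error)^2."""
    return (estimate / std_error) ** 2

# (estimate, standard error, printed chi-square) for three rows of Table 5.
table5_rows = {
    "Intercept": (12.7835, 6.4323, 3.9498),
    "T1": (0.2715, 0.054, 25.2902),
    "T4": (0.5873, 0.0811, 52.4862),
}

gaps = {
    name: abs(wald_chi_square(est, se) - printed)
    for name, (est, se, printed) in table5_rows.items()
}
for name, gap in gaps.items():
    print(name, round(gap, 4))
```

Because the statistic is a square, the comparison does not depend on the sign of the estimate.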
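Table 6
The cross-validated accuracy (%) per $(C, \gamma)$. Rows give $C$; columns give $\gamma$.

| $C$ \ $\gamma$ | $10^{-4}$ | $10^{-3}$ | $10^{-2}$ | $10^{-1}$ | $10^{0}$ | $10^{1}$ | $10^{2}$ | $10^{3}$ | $10^{4}$ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| $10^{-4}$ | 47.72 | 47.72 | 47.72 | 47.72 | 47.72 | 47.72 | 47.72 | 47.72 | 47.72 |
| $10^{-3}$ | 47.72 | 47.72 | 47.72 | 47.72 | 47.72 | 47.72 | 47.72 | 47.72 | 47.72 |
| $10^{-2}$ | 47.72 | 47.72 | 47.72 | 47.72 | 47.72 | 47.72 | 47.72 | 47.72 | 47.72 |
| $10^{-1}$ | 58.70 | 56.97 | 48.90 | 47.72 | 50.07 | 47.72 | 47.72 | 47.72 | 47.72 |
| $10^{0}$ | 59.33 | 62.23 | 62.16 | 64.57 | 60.92 | 57.67 | 57.19 | 56.70 | 56.70 |
| $10^{1}$ | 62.85 | 64.02 | 62.44 | 64.71 | 60.92 | 57.74 | 57.25 | 56.70 | 56.70 |
| $10^{2}$ | 65.95 | 63.88 | 62.71 | 64.71 | 60.92 | 57.74 | 57.25 | 56.70 | 56.70 |
| $10^{3}$ | 66.16 | 62.57 | 62.37 | 64.71 | 60.92 | 57.74 | 57.25 | 56.70 | 56.70 |
| $10^{4}$ | 65.54 | 62.63 | 62.37 | 64.71 | 60.92 | 57.74 | 57.25 | 56.70 | 56.70 |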
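The search that produces a table like Table 6 can be sketched in a few lines: every $(C, \gamma)$ pair on the grid is scored by fivefold cross-validation and the best pair is kept. The study used the MATLAB SVM toolbox. The sketch below instead plugs in a hypothetical stand-in classifier (an RBF-weighted vote that ignores $C$) so that it runs with the Python standard library alone. It illustrates the search loop, not SVM training, and the toy data are invented.

```python
import math
import random

def k_fold_indices(n, k, rng):
    """Shuffle 0..n-1 and deal the indices into k folds of near-equal size."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cv_accuracy(fit, X, y, k, rng):
    """Average hit ratio over k folds; each fold is held out exactly once."""
    scores = []
    for held_out in k_fold_indices(len(X), k, rng):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict(X[i]) == y[i] for i in held_out)
        scores.append(hits / len(held_out))
    return sum(scores) / k

def grid_search(make_fit, X, y, grid, k=5, seed=0):
    """Test every (C, gamma) pair and keep the best cross-validated accuracy."""
    best = None
    for C in grid:
        for gamma in grid:
            # Same seed for every pair, so all pairs see identical folds.
            acc = cv_accuracy(make_fit(C, gamma), X, y, k, random.Random(seed))
            if best is None or acc > best[0]:
                best = (acc, C, gamma)
    return best

def make_kernel_vote(C, gamma):
    """Hypothetical stand-in trainer: NOT an SVM solver, and C is unused."""
    def fit(X_train, y_train):
        def predict(x):
            score = sum(
                label * math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, xi)))
                for xi, label in zip(X_train, y_train)
            )
            return 1 if score >= 0 else -1
        return predict
    return fit

GRID = [10.0 ** e for e in range(-4, 5)]  # 10^-4, ..., 10^4, as in Section 3.3

# Invented one-dimensional toy data: two well-separated clusters.
X = [[j / 10.0] for j in range(20)] + [[10.0 + j / 10.0] for j in range(20)]
y = [-1] * 20 + [1] * 20

best_acc, best_C, best_gamma = grid_search(make_kernel_vote, X, y, GRID)
print(best_acc, best_C, best_gamma)
```

Reusing the same fold assignment for every pair keeps the comparison between parameter settings from being confounded by different random partitions. Replacing `make_kernel_vote` with a real SVM trainer would turn the loop into the search described in the text.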
3.5. Construction of back-propagation neural networks

In this study, three-layer fully connected back-propagation neural networks (BPNs) are used as benchmarks. With BPNs, this study varies the number of nodes in the hidden layer and the stopping criteria for training. In particular, 7, 14, and 21 hidden nodes are used for each stopping criterion since BPN does not have a general rule for determining the optimal number of hidden nodes (Kim, 2003). For the stopping criteria of BPN, this study allows 2000 learning epochs per training. E-miner of SAS 9.1 was used to perform the BPN experiments.

4. Results

4.1. SVM model

After conducting the grid search on the training data, the optimal $(C, \gamma)$ was found to be $(10^{3}, 10^{-4})$ with a cross-validated accuracy of 66.16%. Table 6 summarizes the results of the grid search using the cross-validated accuracy as an evaluation criterion. As mentioned above, a fivefold cross-validation procedure was applied. Thus, the accuracy displayed in Table 6 is the average of the $K = 5$ results on the validation data sets. All results for the fivefold cross-validation are displayed in Appendix 1. Additionally, Fig. 1 shows the sensitivity of SVM to various parameters. The experimental results show that the prediction performance of SVM is sensitive to the parameter values, so simultaneous optimization of the parameters is needed for the best prediction.

4.2. BPN and logistic regression models

As mentioned in Section 3.2, in BPN, each data set is divided into three subsets: a training set of 60% (868), a validation set of 20% (290), and a test set of 20% (290) of the total data (1448). Generally, the accuracy of BPN can be influenced by the number of hidden nodes, and the BPN was run with 7, 14, and 21 hidden nodes. The results of the three-layer BPN are summarized in Table 7 and show that the average test accuracy is highest, at 64.23%, when the number of hidden nodes is 7.
In logistic regression, each data set is divided into two subsets: a training set of 80% (1158) and a validation set of 20% (290) of the total data (1448). Table 8 shows the accuracy for the five data sets extracted from the fivefold cross-validation. The accuracy for the validation data sets ranges from 61.73% to 66.44%, and the average accuracy using logistic regression is 64.16%.
Fig. 1. Prediction performance of SVM according to various parameters.
Table 7
Classification accuracy results of back-propagation neural networks.

| # of hidden nodes | Data set | Set 1 (%) | Set 2 (%) | Set 3 (%) | Set 4 (%) | Set 5 (%) | Average (%) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 7 | Training | 68.24 | 66.01 | 66.36 | 68.24 | 71.04 | 67.98 |
| 7 | Validation | 69.21 | 63.10 | 64.14 | 62.76 | 67.48 | 65.34 |
| 7 | Testing | 62.76 | 65.52 | 61.38 | 66.44 | 65.05 | 64.23 |
| 14 | Training | 68.24 | 66.01 | 66.36 | 68.24 | 71.04 | 67.98 |
| 14 | Validation | 69.21 | 63.10 | 64.14 | 62.76 | 67.48 | 65.34 |
| 14 | Testing | 61.73 | 63.45 | 61.38 | 62.63 | 64.36 | 62.71 |
| 21 | Training | 65.71 | 66.48 | 66.01 | 67.32 | 63.91 | 65.89 |
| 21 | Validation | 65.74 | 63.10 | 63.79 | 64.48 | 67.13 | 64.85 |
| 21 | Testing | 62.76 | 64.14 | 63.1 | 65.4 | 65.4 | 64.16 |
Table 8
Classification accuracy results of the logistic regression model.

| Data set | Set 1 (%) | Set 2 (%) | Set 3 (%) | Set 4 (%) | Set 5 (%) | Average (%) |
| --- | --- | --- | --- | --- | --- | --- |
| Training | 64.51 | 64.16 | 66.35 | 64.34 | 64.37 | 64.75 |
| Validation | 63.45 | 63.79 | 65.40 | 61.73 | 66.44 | 64.16 |
Table 9
The prediction accuracy of SVM, BPN, and logistic regression (hit ratio: %).

| | SVM | Logistic regression | BPN |
| --- | --- | --- | --- |
| Accuracy | 66.16% | 64.16% | 64.23% |
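The figures in Table 9 can be re-derived from the per-fold results. The short check below averages the testing row for 7 hidden nodes in Table 7 and the validation row in Table 8, then computes the margin of SVM over each benchmark. The variable names are illustrative only.

```python
def mean(values):
    return sum(values) / len(values)

# Per-fold accuracies (%) copied from Tables 7 and 8.
bpn_testing_7_nodes = [62.76, 65.52, 61.38, 66.44, 65.05]
logit_validation = [63.45, 63.79, 65.40, 61.73, 66.44]
svm_cross_validated = 66.16  # best cell of Table 6

bpn_average = round(mean(bpn_testing_7_nodes), 2)
logit_average = round(mean(logit_validation), 2)
margin_over_logit = round(svm_cross_validated - logit_average, 2)
margin_over_bpn = round(svm_cross_validated - bpn_average, 2)
print(bpn_average, logit_average, margin_over_logit, margin_over_bpn)
```

The margins are differences between hit ratios, so they are most naturally read as percentage points.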
4.3. Prediction performance comparisons

Table 9 compares the best prediction of SVM, logistic regression and BPN on validation data and shows that SVM has the highest accuracy, 66.16%, among the three methods. Table 9 also shows that SVM outperforms logistic regression and BPN by 2.00 and 1.93 percentage points, respectively, on validation data. Based on this result, the SVM model is recommended for default prediction for technology funding. Overall, relatively low accuracy was observed for all three approaches, and the improvement achieved by SVM may also appear small. However, in view of the significant amount of defaulted loans associated with incorrect selection based on the prediction model, the contribution of SVM is considered to be substantial.

5. Conclusion

In Korea, many forms of credit guarantee have been issued to support SMEs with a high degree of growth potential in technology. The Korean government has been funding SMEs with superior technology. However, high default rates among fund recipient SMEs have recently been reported. In order to effectively manage such funds for SMEs, an accurate default prediction model is needed. In this paper, SVM was applied to predict default more accurately and was compared with existing methods such as logistic regression and back-propagation neural networks. In order to build the SVM model, a grid-search method was applied to find optimal values for the parameters $C$ and $\gamma$, which are the most important parameters of the RBF function, one of the kernel functions of the SVM model. A fivefold cross-validation procedure was also utilized to obtain more general results. For the selection of input variables, this study considered not only financial variables, which are widely used to predict the default of a company, but also variables reflecting the characteristics of funded SMEs, such as economic conditions and technology evaluation scores, to build a more objective and accurate default prediction model.
The empirical results showed that SVM outperformed the other methods. From these results, we consider that SVM can serve as an alternative method for default prediction. It is expected that the proposed model can be applied to a wide range of technology evaluation and loan or investment decisions for technology-based SMEs. For future work, we intend to optimize the kernel function and its parameters simultaneously. In addition, this study analyzed empirical data from 1997 to 2002; with a more extensive data set, further analyses could yield more general results.
Acknowledgement This work was supported by the Korea Science and Engineering Foundation (KOSEF) grant funded by the Korea government (MEST) (No. R01-2008-000-11003-01).
Appendix 1 Accuracy (%) for SVM models using fivefold cross-validation (rows grouped by γ; columns give C)

γ       Fold    10^-4   10^-3   10^-2   10^-1   10^0    10^1    10^2    10^3    10^4
10^-4   Set 1   48.28   48.28   48.28   61.72   60.00   60.34   65.17   67.24   64.48
        Set 2   47.93   47.93   47.93   57.24   55.17   56.55   61.03   62.07   62.41
        Set 3   48.28   48.28   48.28   57.93   59.66   64.48   67.24   66.21   66.21
        Set 4   46.02   46.02   46.02   60.21   63.67   70.93   70.59   69.55   70.93
        Set 5   48.10   48.10   48.10   56.40   58.13   61.94   65.74   65.74   63.67
        Mean    47.72   47.72   47.72   58.70   59.33   62.85   65.95   66.16   65.54

10^-3   Set 1   48.28   48.28   48.28   63.80   64.83   61.03   60.34   57.93   57.93
        Set 2   47.93   47.93   47.93   58.62   60.69   61.72   63.10   61.72   61.72
        Set 3   48.28   48.28   48.28   52.76   58.97   64.48   61.72   61.03   61.03
        Set 4   46.02   46.02   46.02   51.90   65.40   69.90   71.28   69.55   69.55
        Set 5   48.10   48.10   48.10   57.79   61.25   62.98   62.98   62.63   62.93
        Mean    47.72   47.72   47.72   56.97   62.23   64.02   63.88   62.57   62.63

10^-2   Set 1   48.28   48.28   48.28   48.28   58.62   61.03   61.03   61.03   61.03
        Set 2   47.93   47.93   47.93   47.93   61.03   59.31   59.31   59.31   59.31
        Set 3   48.28   48.28   48.28   48.28   62.76   60.00   60.00   60.00   60.00
        Set 4   46.02   46.02   46.02   51.90   65.40   69.90   71.28   69.55   69.55
        Set 5   48.10   48.10   48.10   48.10   62.98   61.94   61.94   61.94   61.94
        Mean    47.72   47.72   47.72   48.90   62.16   62.44   62.71   62.37   62.37

10^-1   Set 1   48.28   48.28   48.28   48.28   62.07   62.07   62.07   62.07   62.07
        Set 2   47.93   47.93   47.93   47.93   63.45   63.45   63.45   63.45   63.45
        Set 3   48.28   48.28   48.28   48.28   65.17   65.17   65.17   65.17   65.17
        Set 4   46.02   46.02   46.02   46.02   69.90   70.59   70.59   70.59   70.59
        Set 5   48.10   48.10   48.10   48.10   62.28   62.28   62.28   62.28   62.28
        Mean    47.72   47.72   47.72   47.72   64.57   64.71   64.71   64.71   64.71

10^0    Set 1   48.28   48.28   48.28   48.28   57.93   57.93   57.93   57.93   57.93
        Set 2   47.93   47.93   47.93   59.66   59.66   59.66   59.66   59.66   59.66
        Set 3   48.28   48.28   48.28   48.28   62.41   62.41   62.41   62.41   62.41
        Set 4   46.02   46.02   46.02   46.02   66.10   66.10   66.10   66.10   66.10
        Set 5   48.10   48.10   48.10   48.10   58.48   58.48   58.48   58.48   58.48
        Mean    47.72   47.72   47.72   50.07   60.92   60.92   60.92   60.92   60.92

10^1    Set 1   48.28   48.28   48.28   48.28   55.52   55.86   55.86   55.86   55.86
        Set 2   47.93   47.93   47.93   47.93   56.21   56.21   56.21   56.21   56.21
        Set 3   48.28   48.28   48.28   48.28   59.66   59.66   59.66   59.66   59.66
        Set 4   46.02   46.02   46.02   46.02   62.28   62.28   62.28   62.28   62.28
        Set 5   48.10   48.10   48.10   48.10   54.67   54.67   54.67   54.67   54.67
        Mean    47.72   47.72   47.72   47.72   57.67   57.74   57.74   57.74   57.74

10^2    Set 1   48.28   48.28   48.28   48.28   55.52   55.52   55.52   55.52   55.52
        Set 2   47.93   47.93   47.93   47.93   55.52   55.52   55.52   55.52   55.52
        Set 3   48.28   48.28   48.28   48.28   58.62   58.62   58.62   58.62   58.62
        Set 4   46.02   46.02   46.02   46.02   61.60   61.94   61.94   61.94   61.94
        Set 5   48.10   48.10   48.10   48.10   54.67   54.67   54.67   54.67   54.67
        Mean    47.72   47.72   47.72   47.72   57.19   57.25   57.25   57.25   57.25

10^3    Set 1   48.28   48.28   48.28   48.28   55.52   55.52   55.52   55.52   55.52
        Set 2   47.93   47.93   47.93   47.93   54.83   54.83   54.83   54.83   54.83
        Set 3   48.28   48.28   48.28   48.28   58.28   58.28   58.28   58.28   58.28
        Set 4   46.02   46.02   46.02   46.02   60.21   60.21   60.21   60.21   60.21
        Set 5   48.10   48.10   48.10   48.10   54.67   54.67   54.67   54.67   54.67
        Mean    47.72   47.72   47.72   47.72   56.70   56.70   56.70   56.70   56.70

10^4    Set 1   48.28   48.28   48.28   48.28   55.52   55.52   55.52   55.52   55.52
        Set 2   47.93   47.93   47.93   47.93   54.83   54.83   54.83   54.83   54.83
        Set 3   48.28   48.28   48.28   48.28   58.28   58.28   58.28   58.28   58.28
        Set 4   46.02   46.02   46.02   46.02   60.21   60.21   60.21   60.21   60.21
        Set 5   48.10   48.10   48.10   48.10   54.67   54.67   54.67   54.67   54.67
        Mean    47.72   47.72   47.72   47.72   56.70   56.70   56.70   56.70   56.70
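As a consistency check on Appendix 1, the sketch below recomputes the Mean row from the five fold accuracies of the block whose mean peaks at 66.16%, and locates the best column. The accuracy values are copied from the appendix; the variable names, the use of NumPy, and the mapping of the nine column positions to C = 10^-4, ..., 10^4 are assumptions made for illustration.

```python
import numpy as np

# Assumed column layout: C = 10^-4, 10^-3, ..., 10^4 (nine columns).
C_EXPONENTS = list(range(-4, 5))

# Fold accuracies (%) for Set 1 to Set 5, copied from the block of Appendix 1
# that contains the best mean accuracy.
fold_acc = np.array([
    [48.28, 48.28, 48.28, 61.72, 60.00, 60.34, 65.17, 67.24, 64.48],
    [47.93, 47.93, 47.93, 57.24, 55.17, 56.55, 61.03, 62.07, 62.41],
    [48.28, 48.28, 48.28, 57.93, 59.66, 64.48, 67.24, 66.21, 66.21],
    [46.02, 46.02, 46.02, 60.21, 63.67, 70.93, 70.59, 69.55, 70.93],
    [48.10, 48.10, 48.10, 56.40, 58.13, 61.94, 65.74, 65.74, 63.67],
])

# Average over folds (rows) to get one mean accuracy per value of C.
mean_acc = fold_acc.mean(axis=0)

# Index of the column with the highest mean accuracy.
best = int(np.argmax(mean_acc))
print(f"best C = 10^{C_EXPONENTS[best]}, mean accuracy = {mean_acc[best]:.2f}%")
```

The recomputed maximum agrees with the 66.16% accuracy reported for SVM in Section 4.3, and under the assumed column layout it falls at C = 10^3.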