Expert Systems with Applications 36 (2009) 7515–7518
Integrating nonlinear graph based dimensionality reduction schemes with SVMs for credit rating forecasting

Shian-Chang Huang
Department of Business Administration, National Changhua University of Education, College of Management, No. 2, Shi-Da Road, Changhua 500, Taiwan
Keywords: Kernel graph embedding; Dimensionality reduction; Support vector machine; Multi-class classification; Credit rating
Abstract

By integrating graph based nonlinear dimensionality reduction with support vector machines (SVMs), this study develops a novel prediction model for credit rating forecasting. SVMs have been successfully applied in numerous areas and have demonstrated excellent performance. However, due to the high dimensionality and nonlinear distribution of the input data, this study employs a kernel graph embedding (KGE) scheme to reduce the dimensionality of the input data and enhance the performance of SVM classifiers. Empirical results indicate that the one-vs-one SVM with KGE outperforms other multi-class SVMs and traditional classifiers. Compared with other dimensionality reduction methods, the performance improvement owing to KGE is significant.
1. Introduction

Credit ratings assess the creditworthiness of an individual, a corporation, or even a country. Typically, a credit rating tells a lender or investor the probability of the borrower being able to pay back a loan. Consequently, credit ratings are important determinants of risk premiums and even of the marketability of corporate bonds. Credit rating forecasting has recently become a critical issue in the banking industry: banking institutions and their regulators seek precise internal credit systems to model the credit quality of their borrowers. Furthermore, the subprime mortgage crisis in the latter half of 2007 profoundly impacted the US banking sector, and the banks with the most accurate estimates of their credit risk will be the most profitable. The objective of this study is thus to develop a reliable and accurate prediction model for risk assessment.

The development of corporate credit rating prediction models has attracted considerable research interest in the academic and business communities. Many researchers have attempted to construct automatic classification systems using data mining methods, such as statistical and artificial intelligence techniques. However, due to the high dimensionality of the input variables (both financial and non-financial information), this study combines a kernel graph embedding (KGE) scheme proposed by Yan et al. (2007) with multi-class SVMs to enhance the predictions.

Numerous classification techniques have been adopted for credit scoring. These techniques include (1) traditional statistical methods, for example, discriminant analysis and logistic regression
(Steenackers & Goovaerts, 1989; Stepanova & Thomas, 2001) and Bayesian networks, (2) non-parametric statistical models, such as k-nearest neighbors (Henley & Hand, 1997), (3) decision trees (Yobas, Crook, & Ross, 2000), and (4) neural networks (Desai, Crook, & Overstreet, 1996; West, 2000; Yobas et al., 2000). Recently, the support vector machine (SVM) (Cristianini & Shawe-Taylor, 2000; Schoelkopf, Burges, & Smola, 1999; Vapnik, 1999), another form of neural network, has become increasingly popular and is currently regarded as a state-of-the-art technique for regression and classification applications. The formulation of an SVM embodies the structural risk minimization principle (a maximum margin classifier) and thus combines excellent generalization properties with a sparse model representation. SVMs exploit the idea of mapping the input data into a high dimensional reproducing kernel Hilbert space (RKHS) in which a linear classification is performed. However, public financial statements provide large amounts of data that can be used for corporate credit rating prediction, and such large-scale input data can make SVM classifiers infeasible owing to the curse of dimensionality. Consequently, one needs to select key features from the raw data to reduce the dimensionality of the classification problem.

Dimensionality reduction has been studied extensively in both the statistics and machine learning communities during recent decades. Among dimensionality reduction methods, the linear algorithms principal component analysis (PCA) and linear discriminant analysis (LDA) have been the two most popular because of their relative simplicity and effectiveness. However, as indicated by Yan et al. (2007), in many real world problems there is no evidence that the data are sampled from a linear subspace. This has motivated researchers to consider manifold based techniques for dimensionality reduction. Recently, various
manifold learning techniques, such as ISOMAP (Tenenbaum, de Silva, & Langford, 2000), locally linear embedding (LLE) (Roweis & Saul, 2000), and the Laplacian eigenmap (Belkin & Niyogi, 2001), have been proposed which reduce the dimensionality of a fixed input data set in a way that maximally preserves certain inter-point relationships. This research adopts the general framework of Yan et al. (2007), called kernel graph embedding (KGE), for dimensionality reduction. Their framework offers a unified view for understanding and explaining many popular dimensionality reduction algorithms. The kernelization of graph embedding applies the kernel trick to the linear graph embedding algorithm and can therefore handle data with nonlinear distributions.

To handle the high dimensionality of the input data, this study combines KGE with SVMs to increase rating accuracy. KGE reduces the dimensionality of the input data and simultaneously eliminates irrelevant features, which lowers the computational load of the SVMs and enhances forecasting accuracy. Moreover, this study applies three types of multi-class SVMs (one-vs-one, one-vs-rest, and multi-class SVM) to classify enterprise credit ratings and compares these SVM classifiers with traditional classifiers. Empirical results indicate that the performance of SVMs with KGE is promising, and the performance improvement owing to KGE is significant. The method developed here will help financial institutions assess their credit risks accurately and substantially reduce their losses.

The remainder of this paper is organized as follows: Section 2 describes the multi-class SVMs. Section 3 introduces the KGE algorithm. Section 4 describes the study data and discusses the empirical findings. Conclusions are given in Section 5.

2. Support vector machines

Support vector machines (SVMs) were proposed by Vapnik (1999). Based on the structural risk minimization (SRM) principle, SVMs seek to minimize an upper bound on the generalization error instead of the empirical error, as in other neural networks. An SVM classifier constructs a hyperplane separating the two classes (labeled y ∈ {−1, 1}) so that the margin (the distance between the hyperplane and the nearest point) is maximal. The SVM classification function is formulated as follows:
$$y = \operatorname{sign}\left(w^T \phi(x) + b\right), \qquad (1)$$
where φ(x) is the feature map, a nonlinear mapping from the input space to the feature space. The coefficients w and b are estimated by solving the following optimization problem:
$$\min_{w,\,b,\,\xi} \; R(w, \xi) = \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{l} \xi_i, \qquad (2)$$
subject to
$$y_i\left(w^T \phi(x_i) + b\right) \ge 1 - \xi_i, \quad i = 1, \ldots, l, \qquad (3)$$
$$\xi_i \ge 0, \quad i = 1, \ldots, l, \qquad (4)$$
where C is a prescribed parameter that controls the trade-off between the empirical risk and the smoothness of the model. Taking the Lagrangian and applying the conditions for optimality, the dual of this convex optimization problem can be formulated as follows:
$$\max_{\alpha} \; D(\alpha) = \sum_{i=1}^{l} \alpha_i - \frac{1}{2}\sum_{i,j=1}^{l} y_i y_j \alpha_i \alpha_j K(x_i, x_j), \qquad (5)$$
with constraints
$$0 \le \alpha_i \le C, \quad i = 1, \ldots, l, \qquad (6)$$
$$\sum_{i=1}^{l} \alpha_i y_i = 0, \qquad (7)$$
where the α_i are Lagrange multipliers, which are also the solution to the dual problem, and K(x_i, x_j) is the kernel function. The bias b follows from the complementarity Karush–Kuhn–Tucker (KKT) conditions. The decision function is given by
$$f(x) = \operatorname{sign}\left(\sum_{i=1}^{l} \alpha_i y_i K(x, x_i) + b\right). \qquad (8)$$
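As a concrete illustration, the following minimal Python sketch trains a binary soft-margin SVM of the form described above using scikit-learn. The synthetic data, the RBF kernel, and the value of C are illustrative assumptions rather than settings taken from this study.

```python
# Minimal sketch of a binary soft-margin SVM (Eqs. (1)-(8)); toy data and
# parameters are assumptions for illustration only.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))                # 200 samples, 10 features
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)    # labels y in {-1, +1}

# C controls the trade-off between the margin term ||w||^2 and the slack penalty.
clf = SVC(C=1.0, kernel="rbf")                # K(x_i, x_j) replaces the inner product
clf.fit(X, y)

# decision_function corresponds to sum_i alpha_i y_i K(x, x_i) + b in Eq. (8);
# its sign gives the predicted class, as in Eq. (1).
print(clf.decision_function(X[:3]))
print(clf.predict(X[:3]))
```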
The value of the kernel equals the inner product of the two vectors x and x_i in the feature space, i.e., K(x, x_i) = φ(x)^T φ(x_i). Any function satisfying Mercer's condition (Vapnik, 1999) can be used as the kernel function.

2.1. Multi-class support vector machine

One approach to the multi-class classification problem is to treat it as a collection of binary classification problems. In the one-against-rest method, k classifiers are constructed, one for each class; the nth classifier constructs a hyperplane between class n and the k − 1 remaining classes. A majority vote across the classifiers, or some other measure, is then applied to classify a new point: a point is assigned to the class for which the distance from the margin, in the positive direction (i.e., toward class "one" rather than class "rest"), is maximal. Alternatively, C(k, 2) = k(k − 1)/2 hyperplanes can be constructed, one separating each class from each other class, and a similar voting scheme applied; this is the one-against-one method. Both methods have been used widely in the support vector literature to solve multi-class classification problems.

Another way to solve multi-class problems is to construct a decision function by considering all classes at once (Weston & Watkins, 1999). One can generalize (2) to the following setting:
$$\min_{w,\,b,\,\xi} \; R(w, \xi) = \frac{1}{2}\sum_{m} \|w_m\|^2 + C \sum_{i=1}^{l} \sum_{m \neq y_i} \xi_i^m, \qquad (9)$$
with
$$w_{y_i}^T \phi(x_i) + b_{y_i} \ge w_m^T \phi(x_i) + b_m + 2 - \xi_i^m, \qquad (10)$$
$$\xi_i^m \ge 0, \quad i = 1, \ldots, l, \; m \in \{1, \ldots, k\} \setminus y_i. \qquad (11)$$
This gives the decision function:
$$f(x) = \arg\max_{i = 1, \ldots, k}\left(w_i^T \phi(x) + b_i\right). \qquad (12)$$
One can also find the solution to this optimization problem in the dual variables by finding the saddle point of the Lagrangian. This method is termed MSVM.
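The two decomposition strategies above can be sketched as follows; the toy data and parameters are illustrative assumptions, and scikit-learn's wrappers stand in for the SVM implementations compared in this study.

```python
# Sketch of one-vs-one and one-vs-rest decompositions for multi-class SVMs.
# Data, kernel choice, and C are assumptions for illustration only.
import numpy as np
from sklearn.svm import SVC
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 8))
y = (X[:, 0] > 0).astype(int) + (X[:, 1] > 0).astype(int)   # k = 3 classes

# One-vs-one: k(k-1)/2 binary SVMs, one per pair of classes, combined by voting.
ovo = OneVsOneClassifier(SVC(kernel="poly", degree=2, C=1.0)).fit(X, y)

# One-vs-rest: k binary SVMs, each separating one class from the remaining k-1;
# a new point is assigned to the class with the largest decision value.
ovr = OneVsRestClassifier(SVC(kernel="poly", degree=2, C=1.0)).fit(X, y)

print(ovo.predict(X[:5]), ovr.predict(X[:5]))
```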
3. Kernel graph embedding

In this section, we present the dimensionality reduction method of Yan et al. (2007) and Cai, He, and Han (2007). Given m samples {x_i}, i = 1, ..., m, in R^n, dimensionality reduction aims at finding {y_i}, i = 1, ..., m, in R^d with d ≪ n, such that y_i represents x_i. Over the past decades, many algorithms, both supervised and unsupervised, have been proposed for this problem, and they can all be interpreted within the general graph embedding framework of Yan et al. (2007).

Given a graph G with m vertices, each vertex represents a data point. Let W be a symmetric m × m matrix in which W_ij holds the weight of the edge joining vertices i and j. G and W can be defined so as to characterize certain statistical or geometric properties of the data set. The purpose of graph embedding is to represent each vertex of the graph as a low dimensional vector that preserves the similarities between vertex pairs, where similarity is measured by the edge weight. Let y = [y_1, y_2, ..., y_m]^T be the map from the graph to the real line. The optimal y minimizes
$$\sum_{i,j} (y_i - y_j)^2 W_{ij} \qquad (13)$$
under an appropriate constraint. This objective function incurs a heavy penalty if neighboring vertices i and j are mapped far apart; minimizing it therefore attempts to ensure that if vertices i and j are close, then y_i and y_j are close as well. Expanding the square and using the symmetry of W, we have
$$\sum_{i,j} (y_i - y_j)^2 W_{ij} = 2\sum_{i} y_i^2 D_{ii} - 2\sum_{i,j} y_i y_j W_{ij} = 2\, y^T L y, \qquad (14)$$
where L = D − W is the graph Laplacian (Chung, 1997) and D is a diagonal matrix whose entries are the column (or row, since W is symmetric) sums of W, D_ii = Σ_j W_ji. Finally, the minimization problem reduces to finding
$$y^{*} = \arg\min_{y^T D y = 1} y^T L y = \arg\min \frac{y^T L y}{y^T D y}. \qquad (15)$$
The constraint y^T D y = 1 removes an arbitrary scaling factor in the embedding. The optimal y's can be obtained by solving the minimum-eigenvalue solutions of the generalized eigen-problem L y = λ D y.
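A minimal sketch of this spectral embedding step is given below, assuming an illustrative k-nearest-neighbour heat-kernel weight matrix (not the paper's construction); it solves the generalized eigen-problem L y = λ D y with SciPy and keeps the eigenvectors with the smallest eigenvalues.

```python
# Sketch of Eq. (15): build L = D - W from a symmetric weight matrix W and
# solve L y = lambda * D y. The kNN heat-kernel graph below is an assumption.
import numpy as np
from scipy.linalg import eigh
from sklearn.neighbors import kneighbors_graph

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))

# Symmetric kNN adjacency with heat-kernel weights (illustrative choice).
A = kneighbors_graph(X, n_neighbors=5, mode="distance").toarray()
W = np.exp(-A**2 / (2 * 0.5**2)) * (A > 0)
W = np.maximum(W, W.T)                    # symmetrize

D = np.diag(W.sum(axis=1))
L = D - W

# Generalized eigen-problem L y = lambda D y; eigh returns eigenvalues ascending.
eigvals, eigvecs = eigh(L, D)
embedding = eigvecs[:, 1:3]               # skip the trivial constant eigenvector
print(embedding.shape)                    # (100, 2) low dimensional coordinates
```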
If we choose a linear function, i.e., y_i = f(x_i) = a^T x_i, Eq. (15) can be rewritten as
$$a^{*} = \arg\min \frac{y^T L y}{y^T D y} = \arg\min \frac{a^T X L X^T a}{a^T X D X^T a}, \qquad (16)$$
where X = [x_1, ..., x_m]. The optimal a's are the eigenvectors corresponding to the minimum eigenvalues of the generalized eigen-problem X L X^T a = λ X D X^T a. If instead we choose a function in an RKHS, i.e.,
$$y_i = f(x_i) = \sum_{j=1}^{m} \alpha_j K(x_j, x_i), \qquad (17)$$
where K(x_j, x_i) is a Mercer kernel, Eq. (15) can be rewritten as
$$\alpha^{*} = \arg\min \frac{y^T L y}{y^T D y} = \arg\min \frac{\alpha^T K L K \alpha}{\alpha^T K D K \alpha}, \qquad (18)$$
where α = [α_1, ..., α_m]^T. The optimal α's are the eigenvectors corresponding to the minimum eigenvalues of the eigen-problem K L K α = λ K D K α.

The procedure of KGE is stated below:
1. Constructing the adjacency graph: let G denote a graph with m nodes, the ith node corresponding to the sample x_i. We construct the graph G through the following three steps to model the local structure as well as the label information: (a) put an edge between nodes i and j if x_i is among the p nearest neighbors of x_j or x_j is among the p nearest neighbors of x_i; (b) put an edge between nodes i and j if x_i shares the same label as x_j; (c) remove the edge between nodes i and j if the label of x_i differs from that of x_j.
2. Choosing the weights: W is a sparse symmetric m × m matrix with W_ij holding the weight of the edge joining vertices i and j. (a) If there is no edge between i and j, W_ij = 0. (b) Otherwise,
$$W_{ij} = \begin{cases} 1/l_k & \text{if } x_i \text{ and } x_j \text{ both belong to the } k\text{th class}, \\ \delta\, s(i,j) & \text{otherwise}, \end{cases}$$
where l_k is the number of labeled samples in the kth class and 0 < δ < 1 is a parameter that adjusts the weight between the supervised information and the unsupervised neighborhood information. s(i, j) is a function evaluating the similarity between x_i and x_j, with two variations: (1) the heat kernel, s(i, j) = exp(−‖x_i − x_j‖² / (2σ²)), and (2) the simple-minded weight, s(i, j) = 1.

4. Experimental results and analysis

The Taiwan Economic Journal (TEJ) is an important provider of data on securities markets in Taiwan. This study used all of the financial variables from the TEJ in forecasting enterprise credit ratings. Specifically, these financial variables cover the following categories of information: company scale, financial structure, solvency, business performance, profitability, financial coverage, and cash flow, for a total of 36 input variables. Most of these variables are derived from publicly disclosed information that companies are required to file with authorities such as the securities and futures commission, and they are important for financial analysis. Besides the financial variables, this study also included the historical rating of each company to improve rating accuracy. Information on enterprise credit ratings was also obtained from the TEJ, which provides a credit rating for every publicly traded Taiwanese company. A TEJ rating indicates a company's capacity to meet its financial commitments over a one-year period and is classified as low risk, medium risk, or high risk. A low risk rating indicates that an organization has an extremely strong capacity to meet its commitments, whereas a high risk rating indicates that an organization is likely to default.

This study tested six models for corporate credit rating: one-vs-one SVM, one-vs-rest SVM, multi-class SVM (MSVM), nearest neighbors, logistic regression, and Bayesian networks. For the SVMs, a polynomial kernel of degree two was selected owing to its good performance compared with other types of kernels. The data set comprises 88 Taiwanese high-technology companies traded on Taiwan's securities market, with five ratings for each company over the period from 2000 to 2004. The data set was randomly divided into 10 parts, and 10-fold cross-validation was applied to evaluate model performance.
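The following sketch outlines one way this pipeline could be assembled, assuming synthetic data in place of the TEJ variables, an RBF kernel inside KGE, a purely supervised weight matrix (all samples labeled), and a small ridge term to keep the generalized eigen-problem well conditioned; it is not the exact configuration used in this study.

```python
# Sketch: supervised kernel graph embedding (KGE) to 5 dimensions, followed by a
# one-vs-one SVM with a degree-2 polynomial kernel and 10-fold cross-validation.
# All data and parameter choices here are illustrative assumptions.
import numpy as np
from scipy.linalg import eigh
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.multiclass import OneVsOneClassifier
from sklearn.svm import SVC
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import accuracy_score

def supervised_weights(y):
    """W_ij = 1/l_k when x_i and x_j share class k, else 0 (all samples labeled)."""
    W = np.zeros((len(y), len(y)))
    for k in np.unique(y):
        idx = np.where(y == k)[0]
        W[np.ix_(idx, idx)] = 1.0 / len(idx)
    return W

def kge_fit(X, y, d=5, gamma=0.1, eps=1e-6):
    """Solve K L K a = lambda K D K a and keep the d smallest-eigenvalue vectors."""
    K = rbf_kernel(X, X, gamma=gamma)
    W = supervised_weights(y)
    D = np.diag(W.sum(axis=1))
    L = D - W
    A = K @ L @ K
    B = K @ D @ K + eps * np.eye(len(y))    # small ridge keeps B positive definite
    _, vecs = eigh(A, B)
    return vecs[:, :d]                       # columns are the alpha vectors

def kge_transform(X_train, X_new, alphas, gamma=0.1):
    """Embed new points via y_i = sum_j alpha_j K(x_j, x_i)."""
    return rbf_kernel(X_new, X_train, gamma=gamma) @ alphas

# Toy data standing in for the 36 financial ratios and three rating classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(240, 36))
y = np.digitize(X[:, 0] + X[:, 1], [-0.8, 0.8])   # classes 0, 1, 2

errors = []
for tr, te in StratifiedKFold(n_splits=10, shuffle=True, random_state=0).split(X, y):
    alphas = kge_fit(X[tr], y[tr], d=5)
    Z_tr = kge_transform(X[tr], X[tr], alphas)
    Z_te = kge_transform(X[tr], X[te], alphas)
    clf = OneVsOneClassifier(SVC(kernel="poly", degree=2, C=1.0)).fit(Z_tr, y[tr])
    errors.append(1 - accuracy_score(y[te], clf.predict(Z_te)))
print("mean CV error rate:", np.mean(errors))
```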
4.1. Performance comparison

Table 1 shows that, on average, the one-vs-one SVM outperforms the other SVM classifiers. Compared with the traditional classifiers, Table 1 also reveals that the one-vs-one SVM performs best. Next, KGE is employed to enhance the performance of these SVM classifiers. We compared the performance improvement of KGE with that of PCA and independent component analysis (ICA; Hyvärinen, Karhunen, & Oja, 2001), setting the dimension of the subspace to five for all of these schemes; the results are listed in Table 2. We also compared the performance improvement of KGE with that of a well-known feature selection algorithm, the recursive feature elimination (RFE) method proposed by Guyon, Weston, Barnhill, and Vapnik (2002). The RFE algorithm recursively eliminates input variables; here it was used to identify the most important 5, 10, 15, and 20 feature subsets for comparison, and the results are listed in Table 3. Tables 2 and 3 also list the pure SVM models, without any dimensionality reduction or feature selection, for comparison.

Table 2 shows that only the KGE algorithm significantly improves the performance of these SVM classifiers. The one-vs-one SVM with KGE has better accuracy rates than the other multi-class SVM classifiers. These results demonstrate that in real rating problems the data are not sampled from a linear subspace; hence linear algorithms such as PCA and ICA fail to extract the key information contained in the data, and manifold based techniques for dimensionality reduction are more effective for credit rating problems.
Table 1
Forecasting performance (error rate) of every model

Model                  2000     2001     2002     2003     2004
Nearest neighbors      0.2661   0.2232   0.2286   0.2446   0.3427
Logistic regression    0.2564   0.2676   0.1944   0.2557   0.2597
Bayesian network       0.1923   0.1690   0.1806   0.2208   0.2468
1-vs-1 Pure SVM        0.2161   0.2232   0.1625   0.1929   0.1786
1-vs-rest Pure SVM     0.2143   0.1821   0.1750   0.2679   0.2661
Pure MSVM              0.2375   0.2804   0.2768   0.3304   0.2143
Table 2
Performance comparison (error rate) of three dimensionality reduction schemes

Model                  2000     2001     2002     2003     2004
1-vs-1 Pure SVM        0.2161   0.2232   0.1625   0.1929   0.1786
1-vs-rest Pure SVM     0.2143   0.1821   0.1750   0.2679   0.2661
Pure MSVM              0.2375   0.2804   0.2768   0.3304   0.2143
1-vs-1 + PCA           0.1661   0.2643   0.2393   0.2339   0.2964
1-vs-rest + PCA        0.2036   0.2500   0.2411   0.2071   0.2821
MSVM + PCA             0.2786   0.2214   0.4375   0.2589   0.3946
1-vs-1 + ICA           0.2750   0.1929   0.2196   0.3071   0.2911
1-vs-rest + ICA        0.3250   0.1929   0.2161   0.3714   0.4268
MSVM + ICA             0.3518   0.2214   0.2054   0.3589   0.3304
1-vs-1 + KGE           0.0625   0.1536   0.0536   0.1286   0.0500
1-vs-rest + KGE        0.0875   0.1393   0.0804   0.1661   0.0625
MSVM + KGE             0.2161   0.3214   0.2732   0.1679   0.1500
Table 3
Performance comparison (error rate) of KGE and RFE

Model                Features   2000     2001     2002     2003     2004
1-vs-1 Pure SVM      All        0.2161   0.2232   0.1625   0.1929   0.1786
1-vs-rest Pure SVM   All        0.2143   0.1821   0.1750   0.2679   0.2661
Pure MSVM            All        0.2375   0.2804   0.2768   0.3304   0.2143
1-vs-1 + RFE         5          0.1250   0.2536   0.0250   0.1393   0.1536
1-vs-1 + RFE         10         0.1375   0.1946   0.0679   0.0893   0.1393
1-vs-1 + RFE         15         0.1250   0.2214   0.1107   0.1018   0.1268
1-vs-1 + RFE         20         0.1750   0.2625   0.1339   0.1554   0.1518
1-vs-rest + RFE      5          0.1268   0.2786   0.0393   0.1536   0.1893
1-vs-rest + RFE      10         0.1625   0.1946   0.0929   0.1661   0.1929
1-vs-rest + RFE      15         0.1500   0.1679   0.1089   0.1643   0.1393
1-vs-rest + RFE      20         0.1893   0.1643   0.1500   0.2286   0.2429
MSVM + RFE           5          0.3071   0.3054   0.3304   0.2536   0.2661
MSVM + RFE           10         0.2518   0.2929   0.3161   0.2661   0.2643
MSVM + RFE           15         0.2375   0.2643   0.3036   0.3179   0.2786
MSVM + RFE           20         0.2625   0.2786   0.2750   0.2679   0.2768
1-vs-1 + KGE         5          0.0625   0.1536   0.0536   0.1286   0.0500
1-vs-rest + KGE      5          0.0875   0.1393   0.0804   0.1661   0.0625
MSVM + KGE           5          0.2161   0.3214   0.2732   0.1679   0.1500
Comparing KGE with RFE, Table 3 reveals that the one-vs-one SVM with KGE is the most cost-efficient model, because it uses the lowest-dimensional subspace and achieves the best accuracy rate. Clearly, the accuracy rates of the pure one-vs-one, one-vs-rest, and MSVM models, which contain all of the variables, are lower than those of the models containing fewer variables; that is, more information does not necessarily improve accuracy. Table 3 also shows that the performance improvement owing to RFE is lower than that owing to KGE, regardless of whether 5, 10, 15, or 20 key features are selected by RFE. The subspace formed by KGE contains sufficient information and latent structure to discriminate and represent the data, whereas the feature subset selected by RFE does not.
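For reference, a minimal sketch of an RFE-style baseline is shown below, assuming a linear SVM to rank features by weight magnitude; the data and parameter values are illustrative and do not reproduce the configuration of this study.

```python
# Sketch of recursive feature elimination: a linear SVM ranks features and the
# least important ones are removed until a target subset size remains.
import numpy as np
from sklearn.svm import SVC
from sklearn.feature_selection import RFE

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 36))                      # stand-in for the 36 ratios
y = (X[:, 0] - X[:, 5] + X[:, 12] > 0).astype(int)

# RFE needs an estimator exposing coef_, hence the linear kernel here.
selector = RFE(SVC(kernel="linear", C=1.0), n_features_to_select=5, step=1).fit(X, y)
print("selected feature indices:", np.flatnonzero(selector.support_))
```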
5. Conclusions

Corporate credit ratings provide important information on credit risk for banks and investors in financial markets. This study integrated KGE with SVMs to create a novel classifier for rating prediction. The performance of the hybrid model was examined using a data set comprising a large amount of financial information on Taiwanese high-technology companies. The results show that the new classification model is more accurate than pure SVM classifiers and outperforms traditional techniques when applied to multiple-class credit rating problems.

Kernel graph embedding effectively enhances the classification performance of SVMs. Empirical results showed that the one-vs-one SVM with KGE outperforms the other multi-class SVMs. The accuracy rates of the three types of SVMs using all of the input variables are lower than those of the models using a smaller number of more important latent variables. Compared with traditional dimensionality reduction schemes, the performance improvement resulting from KGE is significant. Restated, nonlinear dimensionality reduction is a key technique for improving the performance of multi-class classifiers.

Future research may consider non-financial and macroeconomic variables as SVM inputs. However, including more information does not guarantee higher accuracy; in this situation, dimensionality reduction and feature selection are important strategies for enhancing classifier performance. Which dimensionality reduction schemes can be efficiently incorporated with SVM classifiers needs further study.

References

Belkin, M., & Niyogi, P. (2001). Laplacian eigenmaps and spectral techniques for embedding and clustering. In Advances in neural information processing systems 14 (pp. 585–591). Cambridge, MA: MIT Press.
Cai, D., He, X., & Han, J. (2007). Spectral regression for dimensionality reduction. Technical Report No. UIUCDCS-R-2007-2856, Department of Computer Science, University of Illinois at Urbana-Champaign.
Chung, F. R. K. (1997). Spectral graph theory. Regional conference series in mathematics (Vol. 92). AMS.
Cristianini, N., & Shawe-Taylor, J. (2000). An introduction to support vector machines. Cambridge University Press.
Desai, V. S., Crook, J. N., & Overstreet, G. A., Jr. (1996). A comparison of neural networks and linear scoring models in the credit union environment. European Journal of Operational Research, 95, 24–37.
Guyon, I., Weston, J., Barnhill, S., & Vapnik, V. (2002). Gene selection for cancer classification using support vector machines. Machine Learning, 46, 389–422.
Henley, W. E., & Hand, D. J. (1997). Construction of a k-nearest neighbour credit-scoring system. IMA Journal of Management Mathematics, 8, 305–321.
Hyvärinen, A., Karhunen, J., & Oja, E. (2001). Independent component analysis. Wiley Interscience.
Roweis, S., & Saul, L. (2000). Nonlinear dimensionality reduction by locally linear embedding. Science, 290(5500), 2323–2326.
Schoelkopf, B., Burges, C. J. C., & Smola, A. J. (1999). Advances in kernel methods – Support vector learning. Cambridge, MA: MIT Press.
Steenackers, A., & Goovaerts, M. J. (1989). A credit scoring model for personal loans. Insurance: Mathematics and Economics, 8, 31–34.
Stepanova, M., & Thomas, L. C. (2001). PHAB scores: Proportional hazards analysis behavioural scores. The Journal of the Operational Research Society, 52, 1007–1016.
Tenenbaum, J., de Silva, V., & Langford, J. (2000). A global geometric framework for nonlinear dimensionality reduction. Science, 290(5500), 2319–2323.
Vapnik, V. N. (1999). The nature of statistical learning theory (2nd ed.). Springer.
West, D. (2000). Neural network credit scoring models. Computers and Operations Research, 27, 1131–1152.
Weston, J., & Watkins, C. (1999). Support vector machines for multi-class pattern recognition. In Proceedings of ESANN'99.
Yan, S., Xu, D., Zhang, B., Zhang, H. J., Yang, Q., & Lin, S. (2007). Graph embedding and extensions: A general framework for dimensionality reduction. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(1), 40–51.
Yobas, M. B., Crook, J. N., & Ross, P. (2000). Credit scoring using neural and evolutionary techniques. IMA Journal of Management Mathematics, 11, 111–125.