Credit scoring and rejected instances reassigning through evolutionary computation techniques




Expert Systems with Applications 24 (2003) 433–441 www.elsevier.com/locate/eswa

Mu-Chen Chen a,*, Shih-Hsien Huang b

a Department of Business Management, Institute of Commerce Automation and Management, National Taipei University of Technology, Taipei, Taiwan, ROC
b Army, Taiwan, ROC

Abstract

The credit industry is concerned with many problems of interest to the computation community. This study addresses two interesting credit analysis problems and resolves them by applying two techniques, neural networks (NNs) and genetic algorithms (GAs), within the field of evolutionary computation. The first problem is constructing an NN-based credit scoring model, which classifies applicants as accepted (good) or rejected (bad) credits. The second is better understanding the rejected credits, and trying to reassign them to the preferable accepted class by using a GA-based inverse classification technique. Each of these problems influences decisions relating to credit admission evaluation, which significantly affect the risk and profitability of creditors. From the computational results, NNs have emerged as a computational tool that is well-matched to the problem of credit classification. Using GA-based inverse classification, creditors can suggest conditional acceptance and explain the conditions to rejected applicants. In addition, applicants can evaluate the option of minimum modifications to their attributes. © 2003 Elsevier Science Ltd. All rights reserved.

Keywords: Credit scoring; Classification; Inverse classification; Neural networks; Genetic algorithms

1. Introduction

With the rapid growth of the credit industry, credit scoring models have been extensively used for credit admission evaluation. Credit scoring models are developed to categorize applicants as either accepted (good) or rejected (bad) credits with respect to characteristics such as age, income and marital status. Creditors accept an application provided the applicant is expected to repay the financial obligation, and reject it otherwise. Creditors can construct the classification rules based on data from previously accepted and rejected applicants. With sizable loan portfolios, even a slight improvement in credit scoring accuracy can reduce the creditors' risk and translate into considerable future savings. According to Brill (1998), the benefits of credit scoring include cost reduction in credit analysis, faster credit evaluation, closer monitoring of existing accounts and improvement in cash flow and collections.

* Corresponding author. Tel.: +886-2-2771-2171x3417; fax: +886-2-2776-3964. E-mail address: [email protected] (M.-C. Chen).

The linear discriminant model (Reichert, Cho, & Wagner, 1983) is one of the first credit scoring models, and it is commonly used today. Linear discriminant analysis (LDA)

for credit scoring has been challenged because of the categorical nature of credit data and the fact that the covariance matrices of the accepted and rejected classes are likely to be unequal (West, 2000). Practitioners and researchers have also applied statistical techniques to develop more sophisticated credit scoring models, including logistic regression analysis (LRA) (Henley, 1995), k nearest neighbor (KNN) (Henley & Hand, 1996) and decision trees (Davis, Edelman, & Gammerman, 1992).

Classification is a commonly encountered decision making task in business. Categorizing an object into a predefined group or class based on a number of observed attributes related to that object is a typical classification problem (Zhang, 2000). In addition to credit scoring and corporate distress prediction, neural networks (NNs) have been successfully applied to a variety of real world classification tasks in industry, business and science. Performance comparisons between neural and conventional classifiers have been made in many studies (Curram & Mingers, 1994; Markham & Ragsdale, 1995). Conventional statistical classification procedures such as LDA and LRA are built on Bayesian decision theory: an underlying probability model must be assumed in order to calculate

0957-4174/03/$ - see front matter © 2003 Elsevier Science Ltd. All rights reserved. doi:10.1016/S0957-4174(02)00191-4



the posterior probability upon which the classification decision is made. In the credit industry, NN has recently been claimed to be an accurate tool for credit analysis (Desai, Crook, & Overstreet, 1996; Malhotra & Malhotra, 2002; West, 2000). Desai et al. (1996) explored the ability of NNs and traditional statistical techniques, such as LDA and LRA, in constructing credit scoring models. Their results indicated that NN shows promise if the performance measure is the percentage of bad loans accurately classified. However, if the performance measure is the percentage of good and bad loans accurately classified, LRA is as good as NN. The percentage of bad loans correctly classified is an important performance measure for credit scoring models since the cost of granting a loan to a defaulter is much larger than that of rejecting a good applicant (Desai et al., 1996). West (2000) investigated the accuracy of quantitative models commonly used by the credit industry. The results indicated that NN can improve credit scoring accuracy. West also suggested that LRA is a good alternative to NN, while LDA, KNN, and classification and regression tree (CART) did not produce encouraging results.

In the field of corporate failure analysis, which is also an important classification problem in business, NNs were likewise reported to be successful. Coats and Fant (1993) utilized both LDA and NN to classify firms obtained from COMPUSTAT as either viable or distressed, and concluded that NN is more accurate than LDA, notably for predicting the distressed companies. Salchenberger, Cinar, and Lash (1992) reported that NN performs as well as or better than LRA in predicting the financial health of savings and loans. From the computational results of Tam and Kiang (1992), NN is most accurate in bank failure prediction, followed by LDA, LRA, KNN and decision trees.
An extensive survey of NN applications in business (Vellido, Lisboa, & Vaughan, 1999) indicates that NNs show promise in various areas where nonlinear relationships are believed to exist within the datasets and traditional statistical approaches are deficient. In credit prediction, the nonlinear features of NNs make them a potential alternative to traditional parametric (e.g. LDA and LRA) and nonparametric (e.g. KNN and decision tree) methods. However, an NN is commonly considered a black-box technique without logic or rule-based explanations for the input-output approximation. A main shortcoming of applying NNs to credit scoring is the difficulty of explaining the underlying rationale for the decision to rejected applicants (West, 2000). In order to investigate the possibility of translating a rejected decision into the accepted class for applicants, creditors can suggest modifications to the adjustable attributes with minimum modification cost. This approach lessens, to a degree, the deficiency of NN-based credit scoring in explaining the rationale for rejection decisions. Creditors can suggest the conditional

acceptance, and further explain the conditions to rejected applicants. On the other hand, applicants can evaluate the option of minimum modifications to their attributes. Some of the factors are adjustable, and they may change now or in the near future.

This study focuses on two interesting credit analysis problems and resolves them by applying two techniques, NNs and GAs, within the field of evolutionary computation. The first problem is constructing the NN-based credit scoring model. The second is better understanding the rejected credits and trying to reassign them to the preferable accepted class by using the GA-based inverse classification technique.

The rest of the paper is organized as follows. Section 2 presents the formulation of inverse classification. Section 3 introduces the NN-based credit scoring and the GA-based inverse classification technique. The computational results of an illustrative example are given in Section 4. Finally, Section 5 concludes the study.

2. Formulation of inverse classification

The credit scoring issue of rejected instance analysis addressed in this paper is a particular inverse classification problem defined by Mannino and Koushik (2000). The inverse classification problem determines the minimum cost alternative by which a reference instance A = {a1, a2, …, an}, currently categorized in class Ci, i ∈ {1, 2, …, m}, can have its attribute values adjusted such that it is categorized in a different class Cj, j ≠ i, j ∈ {1, 2, …, m}. The process of classifying an instance A is a mapping from the set of attribute values {a1, a2, …, an} and a classification model M to exactly one of the m classes: classify(A, M) → Ci, i ∈ {1, 2, …, m}. Let X = {x1, x2, …, xn} denote an adjusted instance of A after one or more attribute values have been manipulated, and let TC(A, X) denote the total cost of the attribute adjustments. Mathematically, the inverse classification problem can be formulated as follows:

Minimize TC(A, X)    (1a)

Subject to: classify(X, M) → Cj    (1b)
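The formulation above can be sketched in code. The following is a minimal illustration, not the paper's method: it uses a hypothetical linear classifier, unit adjustment costs, and a naive grid search in place of the GA developed later in the paper.

```python
# Sketch of the inverse classification problem (Eq. (1)): find the
# minimum-cost adjustment X of a reference instance A such that the
# classifier assigns X to the target class. The linear classifier,
# unit costs, and grid search are illustrative assumptions.
import itertools

def classify(x):
    # hypothetical classifier: accept (class 1) when x1 + 2*x2 >= 10
    return 1 if x[0] + 2 * x[1] >= 10 else 0

def total_cost(a, x):
    # TC(A, X) with one unit of cost per unit of attribute change
    return sum(abs(xi - ai) for ai, xi in zip(a, x))

def inverse_classify(a, target=1, step=0.5, radius=6.0):
    # naive grid search around A for the cheapest accepted instance
    offsets = [i * step for i in range(int(-radius / step), int(radius / step) + 1)]
    best, best_cost = None, float("inf")
    for d in itertools.product(offsets, repeat=len(a)):
        x = [ai + di for ai, di in zip(a, d)]
        if classify(x) == target and total_cost(a, x) < best_cost:
            best, best_cost = x, total_cost(a, x)
    return best, best_cost

a = [3.0, 2.0]               # currently rejected: 3 + 2*2 = 7 < 10
x, cost = inverse_classify(a)  # cheapest adjustment reaching class 1
```

The grid search is only workable for toy problems; the paper replaces it with a GA because the NN discriminative function makes the feasible region nonlinear.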

The growing application of classifiers in credit scoring systems suggests that creditors can significantly benefit from this inverse classification formulation. It lessens, to a degree, the deficiency of NN-based credit scoring in explaining the rationale of rejected applications. A number of factors, including account longevity, credit history, employment category, assets owned, years of residence, amount of loan, etc., are used by creditors in arriving at the credit score. Some of the factors are adjustable, and they may change now or in the near future. In order to investigate the possibility of translating rejected credits into the accepted class, which is preferable to applicants, creditors can perform the minimum possible modifications


to the adjustable attributes by resolving the above mathematical model. From the above description, the inverse classification for credit scoring can be formulated as an optimization problem. The nonlinear features of NN make the formulated mathematical model a complicated optimization problem. In this study, GAs are applied to resolve the optimization model for inverse classification.

3. The proposed methodology

As mentioned in Section 1, NNs outperform conventional statistical techniques such as LDA, LRA, decision trees and KNN in building credit scoring models. In this study, an NN is first used to build the credit scoring model, which is developed to classify applicants as either accepted or rejected with respect to their characteristics. By using the inverse classification technique, creditors can then give rejected applicants more detailed conditions. A GA-based optimization approach is adopted to resolve the inverse classification model presented in Section 2. Fig. 1 illustrates the procedure of our approach for credit scoring and rejected instance analysis.

3.1. The NN-based credit scoring model

In the stage of building credit scoring models, a backpropagation (BP) network is used with historical credit data. The BP NN uses a supervised type of learning. Besides credit scoring, it has been used to generate remarkable solutions in many application areas (Vellido et al., 1999). Readers are assumed to have basic knowledge of NNs; for an in-depth discussion, readers are referred to Freeman and Skapura (1992). The statistical classification models perform favorably only when their essential assumptions are satisfied.

Fig. 1. The proposed approach for credit analysis.


The usefulness of these methods relies heavily on the various assumptions or conditions under which the classification models are built (Zhang, 2000). One major difficulty in applying statistical methods is that users must have a comprehensive knowledge of both data properties and model capabilities before the models can be profitably implemented. NNs have emerged as a computational technology that is well-matched to the problem of classification. In the classification realm, NNs can be successfully applied to construct an approximation mapping the independent variables into group predictions (Markham & Ragsdale, 1995; Zhang, 2000). NNs learn the approximation for a given application using a training data set and later apply this approximation to another set of testing data. In contrast to traditional statistical techniques, the training of an NN does not require knowledge of the underlying relationships between input and output variables. The relationships, whether linear or nonlinear, are automatically incorporated during the network learning stage. The associative abilities of NNs make them robust in situations where the input is noisy or incomplete.

The BP network used herein has an input layer, an intermediate hidden layer, and an output layer. The BP-based credit scoring method is succinctly illustrated in Fig. 2. The input nodes represent the applicant's characteristics, and the output node represents the identified class (say 0 for rejected and 1 for accepted). BP learning involves three stages: the feed-forward of the input training pattern, the calculation of the associated error, and the adjustment of the weights. After the network reaches a satisfactory level of performance, it has learned the relationships between the independent variables (applicant's attributes) and the dependent variable (credit class). The trained BP network can then be adopted as a credit scoring model to classify a credit as either good or bad by inserting the values of the applicant's attributes.
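The three-layer architecture and threshold decision described above can be sketched as a forward pass. The weights, biases, and a 2-3-1 layout (instead of the paper's 15-input network) are illustrative assumptions.

```python
# Minimal sketch of the three-layer BP network's forward pass: sigmoid
# hidden units feed a single sigmoid output, thresholded at 0.5 into
# accepted (1) or rejected (0). All weights here are hypothetical.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, b_hidden, w_out, b_out):
    # hidden-layer activations, then the single output activation
    hidden = [sigmoid(sum(w * v for w, v in zip(ws, x)) + b)
              for ws, b in zip(w_hidden, b_hidden)]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)

# illustrative 2-3-1 network
w_hidden = [[1.2, -0.7], [0.4, 0.9], [-1.1, 0.5]]
b_hidden = [0.1, -0.2, 0.3]
w_out, b_out = [1.5, -0.8, 0.6], -0.4

score = forward([0.6, 0.3], w_hidden, b_hidden, w_out, b_out)
label = 1 if score >= 0.5 else 0   # 1 = accepted, 0 = rejected
```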
The BP-based credit scoring procedure can be described in the following steps.

Fig. 2. The BP-based credit scoring architecture.



Step 1. Data pre-processing. The collected credit data are transformed into training and testing patterns for a BP network. The input-target patterns (X, Y) are presented to the BP network. The input patterns X (vectors of applicants' attributes) are received by the input nodes, and the target patterns Y (credit class) are associated with the output node.

Step 2. Network architecture and learning parameter selection. The architecture of the BP network used in this study has a three-layer configuration. Each node in the input layer corresponds to an applicant's attribute, and the only node in the output layer corresponds to the credit identified as 'accepted' or 'rejected'. To start with a reasonable network configuration, the connection weights and biases of the network are set to small uniformly random numbers between -0.5 and 0.5. The pattern values provided to the input nodes are linearly mapped to a range between 0 and 1.0, and the outputs of BP likewise take values between 0 and 1.0. The user also specifies the learning parameters, including the learning rate, momentum, maximum number of epochs and desired level of error; some guidelines for specifying these parameters can be found in the literature (Freeman & Skapura, 1992). Generally, the learning rate is set between 0.1 and 0.3, and the momentum between 0.8 and 0.99. The number of hidden nodes may be set to (number of input nodes + number of output nodes)/2 + the square root of the number of training patterns.

Step 3. Network training and testing. The training and testing patterns are fed into the BP network to learn the relationships between the input attributes and the credit class. BP is used here for two-class pattern classification. A simple threshold scheme is sufficient for a discriminative function to divide the feature space into two categories in a two-class classification problem.
The target output in the training set is 1 for one class (good, accepted) and 0 for the other class (bad, rejected). While evaluating the BP network, if the output value is greater than or equal to 0.5, the input sample is assigned to the accepted class; otherwise it is assigned to the rejected class. The errors are also used to measure the performance of the networks: a lower desired error level makes the training time longer, but the classification accuracy is higher. The trained network is then presented with input data it has never seen, in order to observe the output generated.

Step 4. Credit scoring model generation. Training a network is an iterative process that continues until the error either converges to a predetermined threshold or stabilizes. Once the obtained error is less than the specified error level or the maximum number of training epochs is reached, the generated activation values of the output patterns are close enough to the target patterns. Using the activation functions, weights and biases, the classification function can be generated from the trained BP network.
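Steps 1 and 2 above involve two concrete calculations: min-max scaling of the inputs to [0, 1] and the hidden-node heuristic. A small sketch, with illustrative values (552 patterns corresponds to four of the five folds of the 690-record dataset used later):

```python
# Sketch of Step 1/Step 2: linear min-max scaling of an input column
# to [0, 1], the hidden-node count heuristic
# (inputs + outputs)/2 + sqrt(training patterns), and the Step 3
# threshold rule on the single output node.
import math

def min_max_scale(column):
    lo, hi = min(column), max(column)
    return [(v - lo) / (hi - lo) for v in column]

def hidden_nodes(n_inputs, n_outputs, n_patterns):
    return round((n_inputs + n_outputs) / 2 + math.sqrt(n_patterns))

def to_class(output, threshold=0.5):
    return 1 if output >= threshold else 0   # 1 = accepted, 0 = rejected

ages = [14, 39.83, 52.33, 77]        # e.g. the bounded attribute a2
scaled = min_max_scale(ages)
n_hidden = hidden_nodes(15, 1, 552)  # 15 attributes, 1 output, 552 training patterns
```

Note the heuristic gives a starting point near the 15-28-1 to 15-33-1 range explored experimentally in Section 4.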

The BP-based credit analysis is usually more accurate than traditional statistical techniques. However, it is difficult to present the classification function mathematically using the weights, biases and activation functions, since the discriminative function of an NN is often much more complicated than that of traditional techniques. Therefore, the BP-based model has difficulty explaining the justification for the decision to rejected credits and providing suggestions to rejected applicants. In this study, the complicated discriminative function is inserted into the mathematical model formulated in Section 2, and the GA-based optimization algorithm is then triggered for the post-classification analysis (inverse classification). By using the computer code of the discriminative functions obtained from the BP-based credit scoring models, the optimization module can easily investigate the possibility of translating a rejected decision into an accepted one by making the minimum modifications to the adjustable attributes.

3.2. The GA-based inverse classification

Genetic algorithms (GAs) are a part of evolutionary computing, which is a rapidly growing area of optimization. GAs are inspired by Darwin's theory of evolution. The solution found by a GA is often considered a good one, because it is seldom possible to prove what the true optimum is. GAs are useful where the search space is large, nonlinear and noisy, and solutions are ill-defined a priori (Goldberg, 1989; Gen & Cheng, 1997). GAs have several distinguishing facets borrowed from an abstract formulation of an evolutionary system.
They are: (1) a representation of problem solutions must be obtained which can be manipulated through crossover and mutation to find other solutions to the problem; (2) starting from an initial population (a set of initial solutions), the next generation of solutions is obtained through the evolving process; and (3) an evaluation of fitness for each solution, which determines its likelihood of surviving into the next generation. GAs have been used to successfully address diverse optimization problems; for a complete discussion of GAs and their applications, refer to Goldberg (1989) and Gen and Cheng (1997).

In approaching the inverse classification problem of credit analysis, the solution is represented in real values, which has been found in the literature to be more effective than a binary-string representation (Gen & Cheng, 1997). The genetic operators of roulette wheel selection, arithmetic crossover and nonuniform mutation are adopted for their effectiveness in addressing constrained nonlinear programs. Constraints are augmented to the objective function using penalty functions. In this study, the proposed GA-based optimization method for inverse classification has been implemented in C++. Implementation of a GA scheme for optimization first requires several basic items to be defined. They are described as follows.


Encoding. A real value encoding is used herein, so that each solution of decision variables is encoded as a vector of real-valued coefficients.

Initial population. An initial population of strings is generated in a random manner. Each solution consists of a string of real values that are randomly drawn from a uniform distribution and then plugged into the model for evaluation.

Evaluation of fitness. The real values in X are inserted into the mathematical model (Eq. (1)) to obtain the corresponding function value. For the roulette wheel selection method, the objective values must be transformed into fitness values in such a way that the fitter solution has the larger fitness value. In the case of minimization problems, the fitness value of a chromosome X can be defined as

fitness(X) = F(X) + |F_min|    (2)

where F(X) is the evaluation function described below, and F_min is the minimum of the evaluation function values for the current population. Due to the nonlinear nature of the NN, the developed mathematical model of credit analysis is a constrained nonlinear program. The penalty technique is perhaps the most common one used to handle infeasible solutions in GAs for constrained optimization problems. In general, this technique transforms a constrained problem into an unconstrained one by penalizing infeasible solutions: a penalty term is added to the objective function for any violation of the constraints. In the case of a minimization problem, the evaluation function F(X) with penalty term takes the form

F(X) = −[TC(A, X) + p(X)]    (3)

where TC(A, X) is the objective function mentioned in Section 2, and p(X) is the penalty term. The penalty function is defined as

p(X) = 0 if X is feasible; p(X) > 0 otherwise    (4)

Population size and number of generations. One of the advantages of GAs over traditional search techniques is that they search many solutions in the solution space in parallel. The size of this parallel search is called the population size (pop_size), which is the number of strings in each generation. The population size is typically problem dependent and needs to be determined experimentally. The population descends from one generation to the next in order to search for a better solution to the problem. The search will normally converge to some near-optimal points after a certain number of generations. The number of generations (max_gen) required to reach convergence is also problem dependent and related to the other parameters; it therefore also has to be determined experimentally.

Selection. In most practice, a roulette wheel approach is adopted as the selection procedure; it belongs to fitness-proportional selection and can select a new population with


respect to the probability distribution based on the fitness values. The roulette wheel can be constructed as follows:

1. Calculate the fitness value fitness(X_i) for each chromosome X_i, i = 1, 2, …, pop_size.

2. Calculate the total fitness for the population:

   F_T = Σ_{i=1}^{pop_size} fitness(X_i)

3. Calculate the selection probability p_i for each chromosome X_i:

   p_i = fitness(X_i) / F_T,   i = 1, 2, …, pop_size.
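The penalized evaluation of Eqs. (2)-(4) and the roulette-wheel steps above can be sketched together. The stand-in discriminative function D and the penalty constant are illustrative assumptions, not the trained BP model.

```python
# Sketch of Eqs. (2)-(4) plus roulette-wheel selection: infeasible
# solutions (D(X) < 0.5) are penalized, objective values are shifted
# into non-negative fitness, and chromosomes are sampled with
# probability proportional to fitness.
import random

def D(x):
    # hypothetical discriminative output in [0, 1]
    return min(1.0, max(0.0, 0.1 * sum(x)))

def evaluate(a, x, penalty=100.0):
    tc = sum(abs(xi - ai) for ai, xi in zip(a, x))  # TC(A, X)
    p = 0.0 if D(x) >= 0.5 else penalty             # Eq. (4)
    return -(tc + p)                                # Eq. (3)

def fitness_values(a, population):
    f = [evaluate(a, x) for x in population]
    f_min = min(f)
    return [fi + abs(f_min) for fi in f]            # Eq. (2)

def roulette_select(population, fitness, rng):
    f_total = sum(fitness)                          # step 2
    probs = [fi / f_total for fi in fitness]        # step 3
    new_pop = []
    for _ in population:                            # spin pop_size times
        r, acc = rng.random(), 0.0
        chosen = population[-1]                     # guard against round-off
        for chrom, p in zip(population, probs):
            acc += p
            if r <= acc:
                chosen = chrom
                break
        new_pop.append(chosen)
    return new_pop

a = [1.0, 1.0]
pop = [[1.0, 1.0], [3.0, 2.5], [2.0, 4.0]]
fit = fitness_values(a, pop)                        # first member infeasible
selected = roulette_select(pop, fit, random.Random(0))
```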

The selection process begins by spinning the roulette wheel a number of times; each time, a single chromosome X_i is selected for the new population with probability p_i.

Cross-over. The crossover operator applied here is arithmetic crossover (Gen & Cheng, 1997), defined as the combination of two chromosomes X_1 and X_2 as follows:

X'_1 = r·X_1 + (1 − r)·X_2,   X'_2 = r·X_2 + (1 − r)·X_1    (5)
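A minimal sketch of the arithmetic crossover in Eq. (5): each offspring gene is a convex combination of the parents' genes, so it stays between the two parent values and the pairwise sums are preserved.

```python
# Arithmetic crossover (Eq. (5)) on real-valued chromosomes.
import random

def arithmetic_crossover(x1, x2, rng):
    r = rng.random()                                   # r in (0, 1)
    c1 = [r * a + (1 - r) * b for a, b in zip(x1, x2)]
    c2 = [r * b + (1 - r) * a for a, b in zip(x1, x2)]
    return c1, c2

c1, c2 = arithmetic_crossover([0.0, 10.0], [10.0, 0.0], random.Random(1))
```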

where r ∈ (0, 1). The probability of crossover is set as p_c, i.e. on average, p_c × 100% of the chromosomes undergo crossover.

Mutation. Nonuniform mutation (Gen & Cheng, 1997) is utilized in this algorithm. For a given parent X, if its element x_k is selected for mutation, the resulting offspring is [x_1, …, x'_k, …, x_n], where x'_k is randomly selected from the two possible choices (with equal probability):

x'_k = x_k + Δ(t, x^U_k − x_k),   x'_k = x_k − Δ(t, x_k − x^L_k)    (6)

where x^U_k and x^L_k are the upper and lower bounds for x_k. The function Δ(t, q) returns a value in the range (0, q) such that Δ(t, q) approaches 0 as the current generation t increases:

Δ(t, q) = q × r × (1 − t/max_gen)^b    (7)

where r is a random number from (0, 1), t is the current generation and b is a parameter determining the degree of nonuniformity. The probability of mutation is set as p_m, i.e. on average, p_m × 100% of the total elements in the population undergo mutation. Every element has an equal chance of being mutated.

Cull mechanism. After crossover and mutation, the freshly generated offspring and mutants are added to the population, so that 2 × pop_size chromosomes (solutions) are present in total. The population of 2 × pop_size is ranked according to fitness value, and the pop_size lowest ranked members are culled, maintaining a constant population size of pop_size. The flowchart of the proposed GA-based optimization approach for inverse classification is illustrated in Fig. 3.

Fig. 3. The GA-based optimization algorithm for inverse classification.

4. Numerical illustrations

A real world dataset obtained from the UCI Repository of Machine Learning Databases (Murphy & Aha, 2001) is adopted herein to evaluate the predictive accuracy of the BP-based credit scoring model and the capability of GA-based inverse classification. The credit scoring results of BP are benchmarked against those generated by LDA and CART.

4.1. Results of BP-based credit scoring

The credit dataset consists of 307 instances of creditworthy applicants and 383 instances of applicants who are not creditworthy. Each instance contains 15 attributes {a1, a2, …, a15} and one class attribute d (accepted or rejected). This dataset is interesting because there is a good mixture of attributes: continuous, nominal with small numbers of values, and nominal with larger numbers of values. There are also a few missing values. To protect the confidentiality of the data, the attribute names and values have been changed to meaningless symbolic data. The credit dataset is randomly partitioned into training and independent test sets using five-fold cross validation, with ten repetitions for each trial. The test set is used to guarantee that our results are valid and can be generalized to make predictions about new data. Each of the five random partitions serves as an independent holdout test set for the credit scoring model trained with the remaining four partitions. The benefits of cross validation are that the impact of data dependency is minimized and the reliability of the results is improved (West, 2000). In addition, the credit scoring model is developed with a large portion of the available data (80% in this case), and all of the data are utilized to test the trained models. In this paper, the performance of the BP classifier is benchmarked against LDA and CART for credit scoring applications. Several options of the NN configuration (Table 1) are evaluated, of which the 15-28-1 architecture is found to obtain the better results. Additionally, the learning rate, momentum and training epochs are set to 0.1, 0.9 and 5000, respectively. The results are averages of the accuracy rates determined for each of the five independent holdout data set partitions (testing accuracy) used in the five-fold cross-validation method. Since the training of an NN is a stochastic process, the accuracy rates for the five dataset partitions are themselves averages of 10 repetitions.
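The five-fold cross-validation protocol described above can be sketched as follows; integer indices stand in for the 690 credit records, and each fold serves once as the holdout test set while the other four (80% of the data) train the model.

```python
# Sketch of the five-fold cross-validation split: shuffle the indices,
# deal them into five folds, and yield (train, test) index pairs where
# each fold is the holdout exactly once.
import random

def five_fold_splits(n_instances, rng):
    idx = list(range(n_instances))
    rng.shuffle(idx)
    folds = [idx[i::5] for i in range(5)]
    for i in range(5):
        test = folds[i]
        train = [j for k, fold in enumerate(folds) if k != i for j in fold]
        yield train, test

# 690 = 307 creditworthy + 383 not creditworthy instances
splits = list(five_fold_splits(690, random.Random(0)))
```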
The learning parameters, including the learning rate and momentum, may influence the learning performance of BP. Therefore, we further experiment with various settings of these two parameters for the 15-28-1 architecture. The computational results are summarized in Table 2.

Table 1
Summary results of BP with various network architectures

Architecture   Rg      Rb      Ra      T
15-28-1        86.57   89.02   87.92   302.2
15-29-1        86.90   87.62   87.31   337.4
15-30-1        86.21   88.74   87.61   327.8
15-31-1        86.21   88.74   87.61   337.8
15-32-1        87.23   87.33   87.31   329.6
15-33-1        86.90   88.18   87.61   335.2

Rg, hit rate of accepted instances; Rb, hit rate of rejected instances; Ra, overall hit rate; T, training CPU time (s).

Table 2
Overall hit rates of various learning parameters for the 15-28-1 architecture

          a = 0.01   a = 0.1   a = 0.03   a = 0.5
M = 0.1   87.46      86.54     86.55      87.92
M = 0.3   87.46      86.39     86.70      87.92
M = 0.5   86.55      86.55     87.31      87.46
M = 0.9   86.85      87.92     87.16      86.39

From this table, the performance of BP is not sensitive to the settings of the learning rate and momentum in this case. The classification results for the credit dataset obtained by BP, LDA and CART are summarized in Table 3. From the obtained results, BP is more reliable than LDA and CART. If the performance measure is the percentage of good credits accurately classified, the LDA and CART models are better than the BP models. For the accuracy percentage of rejected (bad) credits, however, BP performs considerably better than LDA and CART (refer to the hit rate of rejected instances Rb in Table 3). The percentage of bad credits correctly classified is an important performance measure for credit scoring models since the cost of granting a loan to a defaulter is much larger than that of rejecting a good applicant (Desai et al., 1996). Therefore, the additional computational effort required by BP is worthwhile. Although BP takes about 5 min on a Pentium III PC, the generation of the credit scoring model is generally performed off-line. Additionally, there exists no theoretical foundation for determining the optimal network architecture and parameters for a given classification problem; they may therefore be determined by experimentation, and some selection guidelines can be found in the literature. From the computational results shown in Tables 2 and 3, the selection of the network architecture and parameters follows the guidelines discussed in Section 3.1.

Table 3
The comparison of BP, LDA and CART for credit scoring

Method   Rg (%)   Rb (%)   Ra (%)   T (s)
BP       86.90    88.75    87.92    302.2
LDA      92.91    80.86    86.09    <1 (a)
CART     91.90    84.24    87.78    <1 (a)

(a) The required CPU time of training is relatively short, less than 1 s.

4.2. Results of GA-based inverse classification

For credit scoring, the accepted class is the one preferable to applicants. If an application is rejected after credit evaluation, creditors can suggest conditional acceptance and explain the conditions to the rejected applicant. To investigate whether a rejected decision can be translated into the preferable (accepted) class, creditors can perform the minimum modifications to the adjustable attributes. Following the formulation presented in the previous sections, the GA-based inverse classification technique determines the minimum-cost alternative by which an instance A = {a1, a2, ..., an}, currently categorized in the rejected class, can have the values of its adjustable attributes changed so that it is categorized in the preferable accepted class.

A simple threshold scheme is sufficient as a discriminative function to divide the feature space in this two-class (accepted or rejected) classification problem. In the training set, the target output is '1' for the accepted class and '0' for the rejected class. When the discriminative function is evaluated, an input sample is assigned to the accepted class (C1) if the output of the function is greater than or equal to 0.5; otherwise, it is assigned to the rejected class (C2). Hence,

If D(A_i) ≥ 0.5, then O(A_i) = 1 and A_i ∈ C1
If D(A_i) < 0.5, then O(A_i) = 0 and A_i ∈ C2        (8)

where D is the discriminative function obtained from the trained BP network, A_i is the input feature set of the ith sample, and O(A_i) is the output credit class associated with A_i.

Three rejected instances (A1, A2 and A3) from the dataset are selected to test the proposed GA-based optimization method for inverse classification. Table 4 presents the attribute types and the lower (L) and upper (U) bounds of the attributes in the credit dataset, together with the original attribute values of the three test instances. It is assumed that only the continuous attributes can be adjusted to reassign an instance to the accepted class. For each test instance, two to six continuous attributes (problems P1–P5 in Tables 5–7) are selected for adjustment. For example, problem

Table 4
Three rejected instances for inverse classification

Attribute   Type          (L, U)         A1      A2      A3
a1          Categorical   –              0       0       0
a2          Continuous    (14, 77)       39.83   52.33   32.75
a3          Continuous    (0, 28)        0.50    1.38    2.34
a4          Categorical   –              0       1       0
a5          Categorical   –              0       1       0
a6          Categorical   –              6       0       1
a7          Categorical   –              0       1       1
a8          Continuous    (0, 29)        0.25    9.46    5.75
a9          Categorical   –              0       0       1
a10         Categorical   –              1       1       1
a11         Continuous    (0, 67)        0       0       0
a12         Categorical   –              1       0       0
a13         Categorical   –              2       0       0
a14         Continuous    (0, 2000)      288     200     292
a15         Continuous    (0, 100,000)   0       100     0
d           Decision      –              0       0       0
O(A)        Output        –              0       0       0
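The threshold rule of Eq. (8) amounts to a forward pass through the trained BP network followed by a 0.5 cut-off. A minimal sketch with a two-input toy network and made-up weights (the trained 15-attribute network of this study is not published, so all parameter values below are illustrative assumptions):

```python
import numpy as np

# Hypothetical weights and biases of a tiny "trained" BP network with one
# hidden layer (2 inputs, 2 hidden units, 1 output); illustrative values only.
W1 = np.array([[0.8, -0.4],
               [0.3,  0.9]])   # input -> hidden
b1 = np.array([0.1, -0.2])
W2 = np.array([1.2, -0.7])     # hidden -> output
b2 = 0.05

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def D(A):
    """Discriminative function: forward pass of the BP network."""
    h = sigmoid(A @ W1 + b1)
    return sigmoid(h @ W2 + b2)

def O(A):
    """Eq. (8): assign the accepted class (1) iff D(A) >= 0.5, else 0."""
    return 1 if D(A) >= 0.5 else 0
```

With these particular weights, `O(np.array([0.0, 0.0]))` falls on the accepted side of the threshold and `O(np.array([-5.0, 5.0]))` on the rejected side; in the paper the same rule is applied to the network trained on the credit data.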


M.-C. Chen, S.-H. Huang / Expert Systems with Applications 24 (2003) 433–441

Table 5
Results of inverse classification for instance 1 (A1)

Attribute   Original   P1(X1)   P2(X1)   P3(X1)   P4(X1)   P5(X1)
a2          39.83      40.17    40.94    40.95    40.98    40.44
a3          0.50       5.64     5.91     5.79     5.00     5.40
a8          0.25       –        0.33     0.67     0.98     0.33
a11         0          –        –        0.85     0.99     0.39
a14         288        –        –        –        266.07   287.48
a15         0          –        –        –        –        0.86
O(A1)       0          0        0        0        0        0
O(X1)       0          1        1        1        1        1
TC          –          7.68     29.3     6.6      5.48     7.36
T           –          6.41     6.59     6.26     5.49     6.7

Table 7
Results of inverse classification for instance 3 (A3)

Attribute   Original   P1(X3)   P2(X3)   P3(X3)   P4(X3)   P5(X3)
a2          32.75      N/A      N/A      N/A      N/A      32.27
a3          2.34       N/A      N/A      N/A      N/A      26.99
a8          5.75       –        N/A      N/A      N/A      27.00
a11         0.00       –        –        N/A      N/A      25.99
a14         292.00     –        –        –        N/A      291.61
a15         0.00       –        –        –        –        8962.91
O(A3)       0          0        0        0        0        0
O(X3)       0          0        0        0        0        1
TC          –          N/A      N/A      N/A      N/A      9035.67
T           –          5.17     6.44     6.51     6.58     6.65

P1 tries to modify attributes a2 and a3 to reassign the instance to the accepted class. With respect to the mathematical formulation defined in Section 2, the model for problem P1 takes the form

Minimize    TC(A, X) = f2(x2 − a2) + f3(x3 − a3)      (9a)
Subject to: D(X) ≥ 0.5  (i.e. O(X) = 1)               (9b)

where f2(x2 − a2) and f3(x3 − a3) are the cost functions for adjusting attributes a2 and a3, respectively. Because the attribute names are withheld from the dataset for confidentiality, we assume that changing one unit of an attribute value costs one unit; hence f2 = |x2 − a2| and f3 = |x3 − a3|. If the meaning of each attribute were available, domain experts could define the cost functions more adequately. A threshold value of 0.5 is used to distinguish between credit groups; therefore, the discriminative function value D(X) of the adjusted instance must be greater than or equal to 0.5 (D(X) ≥ 0.5) for it to be reassigned to the accepted class (O(X) = 1). The discriminative function is generated from the trained BP network using its activation functions, weights and biases. Problems P2–P5 are formulated in the same manner as P1 in (9a) and (9b).

The GA-based optimization algorithm is then applied to solve the above mathematical model for inverse classification. The GA-specific parameters are set as pop_size = 100, max_gen = 500, Pc = 0.95, Pm = 0.05 and β = 2. The inverse classification results of the three instances are, respectively, summarized

Table 6
Results of inverse classification for instance 2 (A2)

Attribute   Original   P1(X2)   P2(X2)   P3(X2)   P4(X2)   P5(X2)
a2          52.33      N/A      37.06    37.34    36.16    36.04
a3          1.38       N/A      0.63     0.77     0.04     0.58
a8          9.46       –        15.84    15.99    14.98    15.00
a11         0.00       –        –        0.80     0.51     0.93
a14         200.00     –        –        –        199.53   200.97
a15         100.00     –        –        –        –        99.76
O(A2)       0          0        0        0        0        0
O(X2)       0          0        1        1        1        1
TC          –          N/A      22.4     22.93    24.01    24.77
T           –          5.68     6.37     6.48     6.54     6.7

in Tables 5–7. As these results show, it becomes easier to reassign rejected instances to the accepted class when more attributes are adjustable, although this may incur a higher adjustment cost. The adjustment costs for instances A1 and A2 are reasonably low, so creditors can suggest conditional acceptance; applicants can then evaluate the option and modify their attribute status as required. For instance A3, the adjustment cost is very high, and creditors may conclude that no conditional acceptance should be suggested. In addition, the required CPU time for inverse classification is reasonably short, so the proposed approach can be implemented in practice.
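The optimization step for a P1-style problem can be sketched as a simple real-coded GA with a static penalty for the constraint D(X) ≥ 0.5. This is an illustrative simplification: the logistic D below is a made-up stand-in for the trained BP network, the arithmetic crossover and Gaussian mutation stand in for whatever operators the authors used (their β = 2 setting suggests non-uniform mutation), and the attribute values and bounds are those of instance A1 in Table 4.

```python
import math
import random

random.seed(0)  # reproducibility of this sketch

# Hypothetical logistic stand-in for the trained BP network's discriminative
# function; it increases with x2 and x3, and gives D < 0.5 for the original
# instance, so the GA must raise these attributes past the threshold.
def D(x2, x3):
    return 1.0 / (1.0 + math.exp(-(0.05 * (x2 - 45.0) + 0.3 * (x3 - 5.0))))

A = (39.83, 0.50)                      # original a2, a3 of instance A1 (Table 4)
BOUNDS = [(14.0, 77.0), (0.0, 28.0)]   # (L, U) bounds of a2, a3 (Table 4)
PENALTY = 1e4                          # static penalty for violating D(X) >= 0.5

def cost(x):
    """Penalized objective: TC(A, X) = |x2 - a2| + |x3 - a3|, Eq. (9a)."""
    tc = abs(x[0] - A[0]) + abs(x[1] - A[1])
    return tc + PENALTY * max(0.0, 0.5 - D(x[0], x[1]))

def clip(v, lo, hi):
    return min(hi, max(lo, v))

POP_SIZE, MAX_GEN, PC, PM = 100, 500, 0.95, 0.05  # the paper's GA settings

pop = [[random.uniform(lo, hi) for lo, hi in BOUNDS] for _ in range(POP_SIZE)]
best = min(pop, key=cost)
for _ in range(MAX_GEN):
    nxt = [list(best)]                                  # elitism
    while len(nxt) < POP_SIZE:
        p1 = min(random.sample(pop, 2), key=cost)       # binary tournament
        p2 = min(random.sample(pop, 2), key=cost)
        child = list(p1)
        if random.random() < PC:                        # arithmetic crossover
            w = random.random()
            child = [w * u + (1.0 - w) * v for u, v in zip(p1, p2)]
        if random.random() < PM:                        # Gaussian mutation
            i = random.randrange(len(child))
            child[i] += random.gauss(0.0, 1.0)
        nxt.append([clip(v, lo, hi) for v, (lo, hi) in zip(child, BOUNDS)])
    pop = nxt
    best = min(pop, key=cost)

print("X =", best, " D(X) =", D(best[0], best[1]), " TC =", cost(best))
```

The best individual of the final population is the minimum-cost adjusted instance X; if its D(X) still falls below 0.5 (as for instance A3 under problems P1–P4), the run is reported as N/A.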

5. Conclusions

This study demonstrates the advantages of NNs and GAs for credit analysis in the credit industry. NNs have emerged as an important and widely accepted technique for classification, and a substantial body of research on neural classification has established NNs as a promising alternative to various traditional statistical methods. In this study, the NN-based credit scoring model properly classifies applications as either accepted or rejected, thereby minimizing the creditors' risk and translating into considerable future savings. The GA-based inverse classification technique reassigns rejected instances to the preferable accepted class, balancing adjustment cost against customer preference. The computational results on the credit dataset show that the proposed evolutionary computation based approach offers attractive features for a computer-aided credit analysis system.

Acknowledgements

This research was partially supported by the National Science Council, Taiwan, ROC, under grant NSC 91-2416-H-027-004.

