Expert Systems with Applications 26 (2004) 567–573 www.elsevier.com/locate/eswa
Managing loan customers using misclassification patterns of credit scoring model
Yoon Seong Kim, So Young Sohn*
Department of Computer Science and Industrial Systems Engineering, Yonsei University, 134 Shinchon-dong, Seoul 120-749, South Korea
Received 29 September 2003; revised 29 September 2003; accepted 28 October 2003
Abstract
A number of credit scoring models have been developed to evaluate the credit risk of new loan applicants and of existing loan customers. This study proposes a method for managing existing customers by using the misclassification patterns of a credit scoring model. We divide each of two groups of customers, those currently in good credit status and those currently in bad credit status, into two subgroups according to whether or not their credit status is misclassified by a neural network model. We then infer the characteristics of each subgroup and propose a management strategy for each.
© 2004 Elsevier Ltd. All rights reserved.
Keywords: Credit scoring; Misclassification; Segmentation
* Corresponding author. Tel.: +82-221234014; fax: +82-23647807. E-mail address: [email protected] (S.Y. Sohn).
0957-4174/$ - see front matter © 2004 Elsevier Ltd. All rights reserved. doi:10.1016/j.eswa.2003.10.013

1. Introduction
The credit industry in Korea has expanded rapidly over the last several years. Because of intense competition among credit card issuers and banks, more and more people can obtain a credit card or a bank loan without a thorough check of their credit status. This reckless expansion policy has increased the delinquency rate. According to the Financial Supervisory Service (FSS), the delinquency rate rose to 11% in May 2003. Because the delinquency rate has increased continuously, most major credit card companies and banks have had to set aside large reserves for bad debts. Consequently, they are suffering from liquidity problems and decreased profits. Lenders are now trying to reduce the delinquency rate by using various types of consumer credit risk management systems (Malhotra & Malhotra, 2003).
Firms that lend to consumers have to make two types of decisions. First, they must decide whether to grant credit to new applicants. The tools that aid this decision are called credit scoring methods. The second type of decision is how to deal with existing customers. Techniques that help with this decision are called behavioral scoring (Thomas, 2000).
The objective of both credit scoring and behavioral scoring models is to assign loan customers to either a ‘good credit’ group or a ‘bad credit’ group (Lee, Chiu, Lu, & Chen, 2002). Scoring problems are therefore related to the field of classification analysis (Anderson, 1984; Hand, 1981; Johnson & Wichern, 1998; Morrison, 1990). In credit scoring, a classification model is used to categorize new applicants as either accepted or rejected on the basis of characteristics such as age, income and marital status (Chen & Huang, 2003). In behavioral scoring, a classification model is used to predict the future credit status of existing customers from the credit scoring variables together with other variables that describe their behavior (Thomas, 2000).
Many studies have contributed to increasing the accuracy of classification models with various kinds of statistical tools. However, most earlier studies have focused only on building more accurate credit scoring or behavioral scoring models. Even with highly accurate scoring models, some misclassification patterns appear, and these patterns can provide insight. For instance, good customers who are classified as bad could be interpreted as a group that has the potential to default. Using this misclassification pattern, we segment the existing loan customers into four groups. That is, the good
customer group is divided into two groups: customers who are likely to delay future payments and customers who are not. The bad customer group is also divided into two groups: customers who would pay back and customers who would not. After segmentation, we infer the characteristics of each customer group. We then propose management strategies for each customer group according to its characteristics.
This study is organized as follows. Section 2 details the methodology used in this study. Section 3 presents the empirical results of the segmentation and proposes management strategies for each group. Finally, Section 4 summarizes the results and concludes.
2. The proposed methodology
In this study, the classification model for credit scoring is used as a segmentation tool for the existing loan customers. For credit scoring analysis, many studies have reported that neural networks (NNs) perform significantly better than other statistical techniques such as linear discriminant analysis (LDA), multiple discriminant analysis (MDA) and logistic regression analysis (LRA) (Desai, Crook, & Overstreet, 1996; Lacher, Coats, Sharma, & Fant, 1995; Malhotra & Malhotra, 2003; Sharda & Wilson, 1996; West, 2000; Zhang, Hu, Patuwo, & Indro, 1999). Accordingly, we use NNs as the classification model. The NNs are built with the data of the existing customers, which include variables from the application form. Then, all of the existing customers whose data are used to build the classification model are evaluated by the model to obtain their predicted credit status, good or bad.
Because we have to use the data of current loan customers for both training and validation, we apply cross-validation. To implement 10-fold cross-validation, we divide the data sample into 10 mutually exclusive sub-samples. We then build a classification model with nine sub-samples and validate it with the remaining sub-sample. This process is repeated 10 times, each time with a different validation sub-sample and the other nine sub-samples as the training data (Malhotra & Malhotra, 2003). The validation result of each customer thus becomes the predicted credit status of that customer.
Most of the good customers and bad customers would be validated to be in good credit status and bad credit status, respectively, by the scoring model. However, some good customers would be evaluated to be bad and some bad customers would be evaluated to be good. That is, some customers would be misclassified by the model. The good credit customers who are misclassified by the model would have a greater chance of delaying future payments than the other good credit customers. Likewise, the bad credit customers who are misclassified would have a greater chance of repaying the delayed payments than the other bad credit customers.
Therefore, the currently good credit and bad credit customers are each divided into two subgroups according to their classification results. That is, the two groups of existing customers are classified into the following four groups:
Group 1: customers who have not delayed and are not likely to delay future payments;
Group 2: customers who have not delayed but are likely to delay future payments;
Group 3: customers who are currently delinquent but would pay back eventually; and
Group 4: customers who are currently delinquent and would not pay back.
Group 1 is the soundest group among the four because its customers have not delayed payments and are predicted not to become delinquent. The customers in Group 1 should therefore be encouraged to apply for other loans. The customers in Group 2 also have not delayed payments yet, but they are expected to delay future payments. They should therefore be monitored so that they do not become delinquent in the future. The customers in Groups 3 and 4 are currently delinquent, and appropriate procedures should be applied to collect their delayed payments. Of these two groups, the customers in Group 3 have a greater chance of paying back than the customers in Group 4, so they are the most profitable customers, who would pay back not only their principal but also the interest incurred. Therefore, mild and gentle collection procedures should be applied to the customers in Group 3, while thorough and active collection procedures should be applied to the customers in Group 4.
In order to propose management strategies that suit the customers in each subgroup, we need to infer the characteristics of the groups. To do so, we use the input variables that turn out to differ significantly among the groups.
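The segmentation procedure of this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: a simple nearest-centroid rule stands in for the neural network, the data are synthetic, and all function and variable names (`k_fold_indices`, `fit_centroids`, `out_of_fold_predictions`, `segment`) are hypothetical.

```python
import random
from collections import Counter


def k_fold_indices(n, k=10, seed=0):
    """Shuffle indices 0..n-1 and split them into k mutually exclusive folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_centroids(X, y):
    """Hypothetical stand-in for the neural network: one mean vector per class."""
    counts = Counter(y)
    sums = {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict_centroid(model, row):
    """Return the class whose mean vector is closest to the given row."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(row, center))
    return min(model, key=lambda label: sq_dist(model[label]))


def out_of_fold_predictions(X, y, k=10, seed=0):
    """Predict every customer with a model trained on the other k-1 folds."""
    preds = [None] * len(y)
    for held_out in k_fold_indices(len(y), k, seed):
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        model = fit_centroids([X[i] for i in train], [y[i] for i in train])
        for i in held_out:
            preds[i] = predict_centroid(model, X[i])
    return preds


# (actual status, predicted status) -> group number as defined in Section 2
GROUP_OF = {("good", "good"): 1, ("good", "bad"): 2,
            ("bad", "good"): 3, ("bad", "bad"): 4}


def segment(actual, predicted):
    """Assign each customer to Group 1-4 from actual and predicted status."""
    return [GROUP_OF[(a, p)] for a, p in zip(actual, predicted)]


# Synthetic toy data (NOT the German credit data): two made-up numeric features.
rng = random.Random(1)
X = ([[rng.gauss(18, 8), rng.gauss(2.5, 1.5)] for _ in range(70)]
     + [[rng.gauss(30, 8), rng.gauss(5.0, 1.5)] for _ in range(30)])
y = ["good"] * 70 + ["bad"] * 30

predicted = out_of_fold_predictions(X, y)
groups = segment(y, predicted)
print(Counter(groups))  # number of customers per group
```

The point of the sketch is that every customer is scored exactly once, by a model that never saw that customer during training, so the cross-tabulation of actual and predicted status yields the four groups directly.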
3. Empirical analysis
3.1. Data
The data used for this study are the ‘German credit’ data obtained from the UCI Repository of Machine Learning Databases (http://www.niaad.liacc.up.pt/statlog/datasets.html). These data have mostly been used to compare the performance of various classification tools (Lee, Chiu, Lu, & Chen, 2002; Lee & Huh, 2002; Paredes & Vidal, 2000; West, 2000). In contrast, we analyze the characteristics of the customers in order to propose management strategies for them according to their predicted credit status. The data consist of a set of loans given to a total of 1000 applicants. The applicants are divided into two groups: those who were accepted and maintain good credit, and those who were accepted but became delinquent. This credit status is used as the response variable for the behavioral scoring model. The dataset contains 20 input variables such as the loan customer’s age, credit amount, credit history, employment and housing. Through variable selection using the chi-square test, we eliminated nine input variables that turned out to have no significant relation to the target variable. As a result, we used eleven input variables to construct the neural networks: credit amount, credit history, duration in month, other debtors and guarantors, other installment plans, present employment, present residence, property, purpose, savings accounts and bonds, and status of existing checking account.
3.2. Results of classification
Table 1 summarizes the results of the neural network model. The networks were trained on the data of 1000 customers, 700 with good credit and 300 with bad credit, using 10 different sub-samples of the data with varying numbers of training cycles, hidden neurons and nodes. The optimum numbers of hidden neurons and nodes and the training cycle were decided by trial and error for each cross-validation sample. After training, the network classified 66–86% of the good loan customers and 64–83% of the bad loan customers accurately, with an overall classification accuracy of 71–84%. To assess how well the classifiers generalize, 10% of the customers in each sample are used to validate each trained neural network. On the validation sets, the network classified 69–90% of the good loan customers and 64–86% of the bad loan customers accurately, with an overall classification accuracy of 69–84%.

Table 1
Classification of loan customers using neural network
Accuracy of predicted group membership (%)
Actual group     Sample
                 1     2     3     4     5     6     7     8     9     10
Training set
  Good loan      83.2  78.8  65.5  71.7  72.9  81.3  85.8  80.7  74.9  85.1
  Bad loan       64.4  71.9  82.7  74.2  70.0  70.0  64.5  65.2  71.9  80.3
  Overall        77.7  76.7  70.7  72.4  71.9  77.8  79.2  76.1  74.0  83.7
Validation set
  Good loan      78.1  79.7  69.0  71.9  90.0  74.3  84.8  85.1  80.6  85.1
  Bad loan       69.4  80.8  69.0  86.1  70.0  70.0  76.2  72.7  63.6  65.4
  Overall        75.0  80.0  69.0  77.0  84.0  73.0  83.0  81.0  75.0  80.0

3.3. Results of segmentation
We segmented the existing customers into four groups according to the validation result of each customer. Table 2 displays the result of the segmentation. Among the 700 currently good credit customers, 570 customers were evaluated to be in good credit status
(Group 1) while 130 customers were evaluated to be in bad credit status (Group 2). Similarly, among the 300 currently bad credit customers, 90 customers were evaluated to be in good credit status (Group 3) while 210 customers were evaluated to be in bad credit status (Group 4).

Table 2
Segmentation results of loan customers
Actual class                           Classified class
                                       Good credit   Bad credit
Good credit                            570           130
Bad credit                             90            210
Average correct classification rate    78.0%

After segmenting the existing customers, we inferred the characteristics of each group in order to propose appropriate management strategies. To infer these characteristics, we compared the input variables of Group 1 with those of Group 2, and likewise those of Group 3 with those of Group 4. We applied the chi-square test to all input variables to determine whether there are significant differences between the two groups. Two interval input variables, duration in month and credit amount, were converted into ordinal variables so that the chi-square test could be applied. As a result, we could find the characteristics that discriminate Group 1 from Group 2, and those that discriminate Group 3 from Group 4. The results of the chi-square test are given in Table 3. For most of the input variables, there were significant differences between Groups 1 and 2 at the 1% level. Between Groups 3 and 4, most input variables were also significantly different at the 5% level. Among the input variables that proved to be significantly different, we concentrated on those with high chi-square values when inferring the principal characteristics that influence credit status. The differences between Groups 1 and 2 are illustrated in Fig. 1, and the differences between Groups 3 and 4 are illustrated in Fig. 2.

Table 3
The results of chi-square test for input variables of Groups 1–4
Input variable                    Chi square   P-value
Groups 1 and 2
  Credit amount                    45.1309     <0.0001
  Credit history                   64.3595     <0.0001
  Duration in month                65.7950     <0.0001
  Other debtors and guarantors      5.8127      0.0547
  Other installment plans          27.3302     <0.0001
  Present employment               39.9206     <0.0001
  Present residence                16.4709      0.0009
  Property                         30.0046     <0.0001
  Purpose                          24.3521      0.0038
  Status of checking account      109.0647     <0.0001
  Status of savings account        13.9128      0.0076
Groups 3 and 4
  Credit amount                    10.5553      0.0320
  Credit history                   27.1635     <0.0001
  Duration in month                39.8791     <0.0001
  Other debtors and guarantors      0.5248      0.7692
  Other installment plans           1.2326      0.5399
  Present employment                2.5387      0.6377
  Present residence                 5.1957      0.1580
  Property                         15.7281      0.0013
  Purpose                           7.1217      0.6244
  Status of checking account       69.8779     <0.0001
  Status of savings account        14.1298      0.0069

As seen in Fig. 1, the most significant differences between Groups 1 and 2 were in status of checking account, duration in month, credit history, credit amount, present employment, and property. In status of checking account, about 60% of the customers in Group 1 do not have a checking account, while over 80% of the customers in Group 2 have a checking account whose balance is below 200 DM or even below zero. It is an unexpected result that customers who do not have a checking account are predicted to be in good credit status. Their loan applications might have been accepted because they satisfied other requirements even though they did not have a checking account. In duration in month, the duration for about 90% of the customers in Group 1 does not exceed 24 months, while for about 80% of the customers in Group 2 it is more than 24 months. In credit history, most of the existing customers have duly paid back their existing credits until now. It is noticeable that about 77% of the customers who have critical accounts or credits at other banks are in Group 1. Such customers are generally regarded as being in bad credit status. In this experiment, however, they are evaluated as not likely to delay payments. The credit amount of Group 1 is smaller than that of Group 2: the credit amount of over 80% of the customers in Group 1 does not exceed 4000 DM, while about 75% of the customers whose credit amount is over 6000 DM are in Group 2. In present employment, about 70% of the customers employed for less than 1 year are in Group 2, while Group 1 accounts for about 65% of the customers employed for more than 4 years. Finally, in property, the customers who have real estate in Group 1 are about two times as many as those in
Group 2. Conversely, the customers who do not have any property in Group 2 are about two times as many as those in Group 1.
As illustrated in Fig. 2, the differences in characteristics between Groups 3 and 4 are almost the same as those between Groups 1 and 2. In addition, there are significant differences between Groups 3 and 4 in status of savings account. About 77% of the customers who have a savings account with a balance of more than 500 DM are in Group 3, while about 80% of the customers in Group 4 have a savings account with a balance below 100 DM.
As seen in Figs. 1 and 2, Groups 1 and 3 are similar, and Groups 2 and 4 are also similar, in the pattern of their characteristics. This is because the customers in each pair are classified into the same category by the neural network according to their characteristics. With a decision tree or a logistic regression model, one can easily detect which input variables have a significant effect on the target variable. With a neural network, however, one cannot directly find which characteristics of the customers influence their classification result, good or bad credit. By applying the chi-square test to the classified groups of customers, we could examine the decisive characteristics in classifying the customers even though we used a neural network.

Fig. 1. The differences between Group 1 and Group 2.
Fig. 2. The differences between Group 3 and Group 4.

3.4. Suggestion of management strategies
With the inferred characteristics of the four groups, we propose appropriate management strategies for each group. First, the loans of the customers in Group 1 have a shorter duration and a smaller credit amount than those of the other good credit customers. Also, such customers have property and critical accounts or credits at other banks. In addition, they have been employed for a long time. From these patterns we can assume that their credit status is good enough to be guaranteed and that they have enough money for investment. Therefore, loan companies should encourage these customers to invest their surplus money to make a continuous profit and [...]
The duration of the loans in Group 2 is longer and the credit amount is larger than in Group 1. Also, most of the customers in Group 2 have not worked for a long time and do not have property. We can therefore assume that the customers in Group 2 are financially unstable and accordingly have a greater chance of defaulting on loan obligations than the customers in Group 1. They might run into chronic delinquency once they delay a payment. Therefore, lenders should keep observing their individual information, especially the balance of their checking account. If there is any sign of default, lenders should send a notice to remind them of their obligations.
The customers in Group 3 are similar to those in Group 1. Their savings accounts and checking accounts are not in bad status, and they have property such as real estate, like the customers in Group 1. From this we can assume that they are capable of fulfilling their loan obligations but might forget to make a payment. Thus lenders should give the customers a friendly reminder that the payment is past due. In addition, if the customers are suffering from a temporary shortage of money, lenders can accept partial payments and extend the duration of their loans to reduce the burden of repaying the delayed payments. This would not cause financial loss because the duration of these loans is not long and the credit amount is small.
Finally, for the customers in Group 4, a more serious and thorough collection approach is required because they are predicted to keep defaulting on their loan obligations. The customers in Group 4 have no property, and their checking or savings accounts are in bad status. Also, the duration of their loans is long and the credit amount is large. Thus lenders should set an effective strategy to minimize the predicted loss on the delinquent loans and to keep the customers from long-term delinquency. Taking into consideration the loan customers’ individual information, lenders can approve workout plans for the delinquent customers who are likely to make money. However, for the customers predicted not to make enough money, lenders can pledge funds on deposit, charge off their delinquent loans, and repossess their collateral.
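The chi-square comparison of two groups used in Section 3.3 can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the cut point used to turn an interval variable into an ordinal one is hypothetical (the study does not report its cut points), the counts are toy values rather than the German credit data, and significance is judged here against standard 5% critical values instead of an exact P-value. The helper names `to_ordinal`, `contingency_table` and `chi_square` are made up for this example.

```python
from bisect import bisect_left
from collections import Counter


def to_ordinal(value, cuts):
    """Map an interval value to a bin index; bin i covers (cuts[i-1], cuts[i]]."""
    return bisect_left(cuts, value)


def contingency_table(groups, categories):
    """Cross-tabulate group membership (rows) against category (columns)."""
    counts = Counter(zip(groups, categories))
    row_labels = sorted(set(groups))
    col_labels = sorted(set(categories))
    return [[counts[(r, c)] for c in col_labels] for r in row_labels]


def chi_square(table):
    """Return Pearson's chi-square statistic and its degrees of freedom."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(row_tot) - 1) * (len(col_tot) - 1)
    return stat, df


# Standard upper 5% critical values of the chi-square distribution.
CRITICAL_5PCT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}

# Toy example (hypothetical counts): loan duration in months for two groups.
durations = [12] * 30 + [36] * 10 + [12] * 10 + [36] * 30
groups = [1] * 40 + [2] * 40

# Hypothetical single cut point at 24 months: bin 0 is <= 24, bin 1 is > 24.
categories = [to_ordinal(d, [24]) for d in durations]

table = contingency_table(groups, categories)
stat, df = chi_square(table)
significant = stat > CRITICAL_5PCT[df]
print(table, round(stat, 4), df, significant)
```

A larger statistic for a variable means that its distribution differs more strongly between the two groups, which is why the discussion in Section 3.3 concentrates on the variables with the highest chi-square values.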
4. Conclusions and areas of future research
Credit and behavioral scoring have become popular tools for predicting the financial risk of loan customers and for helping loan companies deal with their customers. However, most studies have concentrated on building an accurate credit scoring model to decide whether or not to grant credit to new applicants. In order to strengthen the credit risk management system for existing loan customers, we built a credit scoring model using neural networks. We then segmented the existing customers by using the misclassification pattern of the credit scoring model. The existing customers were divided into four subgroups according to their current credit status and classification results. We inferred the characteristics of the customers in each group and proposed management strategies appropriate to the characteristics of the groups.
Further research may aim at time-series behavioral scoring models that include the change of credit status in every period. By using the proposed methodology with a time-series behavioral scoring model, loan customers can be segmented into more subgroups by means of new variables such as the number of defaults and whether the customer repaid after default. More detailed management strategies for the customers in these subgroups can then be proposed.
Acknowledgements
This work was supported by grant No. (R04-2002-00020003-0) from the Basic Research Program of the Korea Science & Engineering Foundation.
References
Anderson, T. W. (1984). An introduction to multivariate statistical analysis. New York: Wiley.
Chen, M. C., & Huang, S. H. (2003). Credit scoring and rejected instances reassigning through evolutionary computation techniques. Expert Systems with Applications, 24, 433–441.
Desai, V., Crook, J., & Overstreet, G. (1996). A comparison of neural networks and linear scoring models in credit union environment. European Journal of Operational Research, 95(1), 24–37.
Hand, D. J. (1981). Discrimination and classification. New York: Wiley.
Johnson, R. A., & Wichern, D. W. (1998). Applied multivariate statistical analysis (Fourth Edition). Upper Saddle River, NJ: Prentice-Hall.
Lacher, R. C., Coats, P. K., Sharma, S. C., & Fant, L. F. (1995). A neural network for classifying the financial health of a firm. European Journal of Operational Research, 85, 53–65.
Lee, S., & Huh, M. Y. (2002). A measure of association for complex data. Computational Statistics and Data Analysis, 44, 211–222.
Lee, T. S., Chiu, C. C., Lu, C. J., & Chen, I. F. (2002). Credit scoring using the hybrid neural discriminant technique. Expert Systems with Applications, 23, 245–254.
Malhotra, R., & Malhotra, D. K. (2003). Evaluating consumer loans using neural networks. Omega, 31, 83–96.
Morrison, D. F. (1990). Multivariate statistical methods. New York, NY: McGraw-Hill.
Paredes, R., & Vidal, E. (2000). A class-dependent weighted dissimilarity measure for nearest neighbor classification problems. Pattern Recognition Letters, 21(12), 1027–1036.
Sharda, R., & Wilson, R. L. (1996). Neural network experiments in business-failure forecasting: predictive performance measurement issues. International Journal of Computational Intelligence and Organizations, 1(2), 107–117.
Thomas, L. C. (2000). A survey of credit and behavioural scoring: forecasting financial risk of lending to consumers. International Journal of Forecasting, 16, 149–172.
West, D. (2000). Neural network credit scoring models. Computers and Operations Research, 27, 1131–1152.
Zhang, G., Hu, M., Patuwo, B., & Indro, D. (1999). Artificial neural networks in bankruptcy prediction: general framework and cross-validation analysis. European Journal of Operational Research, 116, 16–32.