Analyzing Credit Risk Among Chinese P2P-Lending Businesses by Integrating Text-Related Soft Information

Kun Liang (Corresponding author)
Affiliation: Anhui University
Postal address: 111 Jiulong Road, Hefei, Anhui, China (230601)
Telephone: +86-15005606765
Email: [email protected]

Jun He
Affiliation: Anhui University
Postal address: 111 Jiulong Road, Hefei, Anhui, China (230601)

Abstract: Text-related soft information effectively alleviates the information asymmetry associated with P2P lending and reduces credit risk. Most existing studies use nonsemantic text information to construct credit evaluation models and predict borrowers' level of risk. However, semantic information also reflects borrowers' ability and willingness to repay and might help explain their credit status. This paper examines whether the semantic information in loan description text helps predict the credit risk of different types of borrowers on a Chinese P2P platform. We use 5P credit evaluation theory and the word embedding model to extract the semantic features of loan descriptions across five dimensions. Then, the AdaBoost ensemble learning strategy is applied to construct a credit evaluation model and improve the learning performance of an intelligent algorithm. The extracted semantic features are integrated into the evaluation model to study their ability to explain the credit status of different types of borrowers. We conducted empirical research on the Renrendai P2P platform. Our results show that the semantic features of textual soft information significantly improve the predictive power of credit evaluation models and that the improvement is most pronounced for first-time borrowers. This paper has important practical significance for P2P platforms and the credit risk management of lenders. Furthermore, it has theoretical value for research concerning heterogeneous information-based credit risk analysis methods in big data environments.

Keywords: Credit risk, Chinese P2P, Soft information, 5P analysis, Word embedding model

1 Introduction

Credit risk analyses mitigate the information asymmetry among traders on P2P platforms and reduce loss given default (LGD; Van and Aerts, 2015). Traditionally, borrowers' credit

risk is reflected by their financial information (i.e., "hard information"). More recently, researchers have examined the role that nonfinancial information (i.e., "soft information") plays in predicting credit risk. These studies have shown that borrowers' soft information reflects their creditworthiness, at least to a certain extent. Especially on P2P lending platforms, hard information about individuals or micro and small enterprises (MSEs) is difficult to obtain and verify. Therefore, soft information is an effective supplement to hard information in credit risk analyses. The most common soft information includes borrowers' demographic data, online behavior, and social capital (Pötzsch and Böhme, 2010; Bachmann et al., 2011; Ge et al., 2016). Recently, text-related soft information has been used to predict the probability of default on granted loans on P2P platforms. In P2P-lending businesses, each loan applicant is encouraged to submit a descriptive text that discloses the purpose of the loan and other personal information. Iyer et al. (2016) revealed the predictive power of the characteristics of this text concerning the probability of default. Dorfleitner et al. (2016) examined the relationship between the soft factors derived from descriptive texts and the probabilities of successful funding and default on two leading European P2P platforms. Jiang et al. (2018) introduced a topic model to extract valuable features from loan description text and constructed four models to demonstrate the performance of these features for default prediction. Existing studies have primarily focused on the language elements of descriptive texts in credit risk analyses, such as their readability, length and sentiment. However, little research has explored the effect of semantic features on creditworthiness in the P2P environment. In fact,

the descriptive texts provided by applicants usually contain information such as their fund use, repayment method, family situation, personality and morality. The semantic features of descriptive texts effectively reflect borrowers' ability and willingness to repay, both of which are useful for analyzing the credit risk of loans. In addition, despite the lack of quantitative research, semantic information has been widely used in qualitative analyses of borrowers' credit risk by traditional financial institutions such as banks. For example, bank loan departments use 5P credit evaluation theory to manually read the semantic information in loan description text and make loan decisions. Therefore, in the context of P2P lending, it is meaningful to use quantitative research methods such as the word embedding model and the credit evaluation model to study the influence of semantic features (e.g., 5P features) on borrowers' credit risk. To the best of our knowledge, Jiang et al. (2018) and Wang and Lin (2015) are the only two studies on the effect of semantic features on P2P credit risk. Jiang et al. (2018) extracted the semantic features of loan description text through a topic model, which is a coarse-grained semantic analysis model that does not work well with brief texts. Wang and Lin (2015) extracted keywords pertaining to morality in loan descriptions through manual counting and studied the relationship between the number of these keywords and the probability of default. This manual method is expensive and cannot adapt to the big data environment. Furthermore, existing studies usually ignore the fact that the predictive power of text features varies across borrowers. For example, we cannot obtain the credit histories of newly registered applicants on a P2P platform; in this case, text features might provide some reference value for credit risk evaluation. For frequent users, however, text features might have lower predictive ability regarding creditworthiness

because of the existence of richer hard information. Additionally, most researchers have explored the effect of text features on default prediction in an English-language environment. In fact, China's economy has grown and become more influential on the world stage, and defaults have recently become common on Chinese P2P-lending platforms. Although much research has examined China's credit evaluation system, few scholars have attended to the importance of soft information with regard to credit evaluation and risk management from the perspective of loan description text. Soft information has important theoretical and practical value for borrowers, P2P platforms and lenders. From the point of view of borrowers, creditworthiness depends primarily on their ability and willingness to repay. Existing research primarily reflects borrowers' repayment ability through hard information. Dorfleitner et al. (2016) showed that soft information influences and reflects borrowers' financing activities and repayment behaviors and directly or indirectly affects their repayment willingness and creditworthiness. However, how to extract relevant credit risk features from the soft information of loan description text and construct a suitable credit evaluation model remains to be studied. At the same time, soft information is a signal of credit risk for P2P platforms and lenders (Iyer et al., 2016). How to judge the creditworthiness of different types of borrowers based on risk signals such as soft and hard information is crucial to maintaining the healthy operation of P2P platforms and protecting the interests of investors. The aim of this paper is to fill the aforementioned research gaps. We focused on the Renrendai website, one of the largest P2P platforms in China. We extracted the fine-grained semantic features of loan description text using the word embedding model. Compared with

topic-level semantic features, the word embedding model extracts word-level semantic features that more precisely reflect the semantic information concerning credit risk. We categorize these word-level semantic features into five groups based on 5P theory, which reflects credit risk from five aspects: personal, purpose, payment, promise and prospect. Finally, we combine soft information (text language features and semantic features) and hard information to explore the effect of this information on credit risk, which varies among different kinds of loan applicants. We aim to ascertain how the credit risk prediction ability of text features changes across borrower types. The contributions of our work can be summarized as follows: (1) It reveals the importance

and effect of textual soft information for evaluating the credit risk of different types of borrowers in the Chinese P2P market; (2) following the idea of combining qualitative and quantitative analysis (in this case, combining 5P credit evaluation and the word embedding model), we propose a credit feature extraction method based on the semantics of loan description text; and (3) based on signal theory, we clarify the reasons for the different effects of soft and hard signals across different scenarios, construct a credit evaluation model for specific types of borrowers, and expand the relevant theories and research on P2P credit risk. The results of this paper might help different types of borrowers understand the factors that can affect their financing performance, as well as help investors and P2P platforms enhance their ability to recognize the credit risk of different types of borrowers to better grasp investment opportunities and control risk. The remainder of this paper is organized as follows. In Section 2, we review the literature on credit evaluation in P2P lending. Section 3 provides the research framework of

this paper. The data and research model are introduced in Section 4. Section 5 provides the results and discussion. Finally, we present the study’s conclusions and implications in Section 6.

2 Related studies

This section summarizes the relevant theories and methods of credit evaluation. We first discuss signal theory and how to apply it to the P2P-lending environment; then, we explore the credit evaluation indicators that can be used as signals, as well as the relationship between these indicators and credit risk; finally, we collate the relevant research on credit evaluation models. Combing through the related studies provides a basis for selecting research methods and models. In addition, we summarize the shortcomings of the existing research and put forward the research content of this paper.

2.1 Signal theory

Signal theory was first proposed by Spence in 1973 (Spence, 1973). The basic idea of this theory concerns how to trade and gain benefits by transmitting or discriminating quality signals in the case of information asymmetry. One of the core issues of P2P lending is information asymmetry (Burtch et al., 2014). Compared with investors, borrowers clearly understand their own credit quality. Therefore, the borrower transmits relevant soft and hard information as signals to the investor, helping the investor to make investment decisions in the case of information asymmetry to facilitate the transaction. For example, the borrower's income level is used as a signal for the lender to judge repayment ability and credit risk (Abdou and Pointon, 2011). In this example, the borrower is the sender of the signal, and the investor is the recipient of the signal. By sending

a signal of income level, the borrower allows the investor to effectively identify their credit status, thereby alleviating the information asymmetry between the two and facilitating a lending transaction. Investors can identify high-quality investment objects by observing multiple signals from borrowers to obtain investment income (Perkins and Hendry, 2005). Signal cost is the core of signal theory (Bird and Smith, 2005). Signals with high production costs are more effective in mitigating information asymmetry (Aggarwal et al., 2012). Rational borrowers do not generate costly signals for small profits, especially when the cost of generating a signal is higher than its benefits. For example, in P2P lending, level of education might be considered a costly signal because it takes a long time to learn and accumulate. Studies have shown that level of education is significantly correlated with the success rate of borrowing and the default rate (Bhatt and Tang, 2002). Based on the existing literature, this paper further studies and explores the effects of signals with different production costs (e.g., textual soft signals and hard signals) on predicting the credit risk of different types of borrowers (i.e., new borrowers and repeat borrowers). We also use signal theory to analyze and interpret the results obtained. Our research provides empirical evidence that some of the costlier signals have a greater effect on lenders' decisions. In P2P lending, expensive signals include those that are costly to produce (e.g., credit rating systems) and difficult to obtain (e.g., a good repayment history). When repeat borrowers have more costly and effective hard information signals to endorse them, the explanatory ability of soft information signals becomes relatively limited. Therefore, compared with predicting the credit risk of new borrowers, text-related soft information is less effective in predicting the creditworthiness of repeat borrowers.

2.2 Credit evaluation indicator

In this section, we discuss which textual features can be used as signals of credit risk and how these signals affect a borrower's creditworthiness. At the same time, we analyze the shortcomings of the existing research and discuss how to use relevant theories and new feature extraction methods to obtain more effective signals and improve the performance of the credit evaluation model. Credit risk can be measured from the perspective of borrowers' ability and willingness to repay. Repayment ability can be reflected by hard information (e.g., financial and asset status), whereas repayment willingness can be reflected by soft information (e.g., demographic data, online behavior, and social capital). Recently, text-related soft information has attracted the attention of scholars and has been used as a signal of credit risk in P2P lending. Borrowers tend to provide textual information if it increases the probability of funding. Thus, more text might be a signal of creditworthiness and should result in a lower risk of default (Dorfleitner et al., 2016). Iyer et al. (2016) found that peer lenders rely on text language features as signals in their borrower screening process and that lenders can predict an individual's likelihood of defaulting on a loan with 45% greater accuracy than by using only the borrower's exact credit score. Dorfleitner et al. (2016) revealed that successfully funded loans exhibit more sophisticated diction, and orthography serves as a signal of creditworthiness and a proxy for education. A negative relationship exists between borrowers' level of education and their probability of default (Bhatt and Tang, 2002). Gao and Lin (2015) found that the readability, sentiment, objectivity and deception of a loan's descriptive text were significantly related to the probability of default on Prosper.

The above studies construct credit evaluation models by considering the language information of the text but fail to reveal the effect of semantic information on credit risk within the P2P environment. Borrowers are required by P2P platforms to disclose semantic information regarding their fund use, repayment plan, income level, and personal and family situation. This semantic information can also be used as a signal to help investors perceive and evaluate the credit risk of borrowers. Wang and Lin (2015) indicated that the more words that show personality in a loan description, the easier it is for borrowers to obtain a loan and the lower the rate of default. In that study, however, the number of words related to personality was counted via manual reading; the efficiency was low, and the sample size was limited. Further study is needed concerning how to automatically analyze the semantic information of loan description text through a relevant semantic modeling method and extract appropriate semantic features as a signal of creditworthiness in a P2P environment. Common semantic modeling methods include the rule-based method, the training-based method, and the topic model method. In the rule-based method, semantic features are extracted by establishing grammar rules, syntax rules or a domain ontology (Ali et al., 2016; Rao et al., 2015; Wang et al., 2008). In the training-based method, semantics are automatically extracted using statistical learning and machine learning methods (Wicaksono and Myaeng, 2013; Gogar et al., 2016). Overall, the rule-based and training-based methods have high labor costs for establishing rules or labeling training samples. The latent Dirichlet allocation (LDA) topic model method effectively extracts topic information from text and has been widely used in risk assessment, business intelligence and decision support (Wang and Xu, 2018). Jiang et al. (2018) extracted semantic signals from loan description text

using the LDA topic model and constructed four default prediction models to demonstrate the performance of these feature signals for default prediction. For brief text on online platforms, however, the semantic modeling ability of the LDA model is weak (Yan et al., 2013). In addition, the LDA model can only obtain topic-level semantic information; it is unable to further analyze the semantic meaning of a word (Kou et al., 2018). Recently, semantic mining based on the word embedding model has become a trend in the field of natural language processing (Young et al., 2018); however, it has not been used to study the extraction of semantic signals from loan description text. The word embedding model is trained to map each word into a k-dimensional real-valued vector. The semantic similarity between words is determined according to the distance between their vectors (Basirat and Nivre, 2017). The word vector contains rich semantic information, is more accurate with regard to word expression, and can uncover the semantic relationships among words in a variety of ways (Wang et al., 2017). However, which kinds of semantic signals are more effective in credit evaluation still needs theoretical argumentation. In summary, the existing literature has rarely investigated the ability of semantic signals to explain personal creditworthiness. This paper combines the 5P credit evaluation method and the word embedding model to extract risk signals from the semantic information of loan description text for credit evaluation. As such, the ability of semantic signals to explain the credit risk of different types of borrowers on a P2P platform is studied.

2.3 Credit evaluation model

The credit evaluation model integrates multiple soft and hard signals to comprehensively evaluate the credit risk of borrowers. In this section, we provide a basis for the selection of

related methods and models by combing the relevant research regarding credit evaluation models. The existing credit evaluation models primarily fall into three categories: statistical evaluation models, intelligent evaluation models and combined evaluation models. The methods used in statistical credit evaluation models usually include discriminant analysis, logistic regression and decision trees. Sowers and Durand (1942) first applied Fisher's linear discriminant analysis to the field of identifying nonperforming loans. Since then, Eisenbeis (1978) applied and promoted this method in the field of credit scoring. Wiginton (1980) applied a logistic regression model to the field of credit scoring. Later, Cramer (2004) developed several variants of logistic regression; the experimental results showed that the boundary logistic regression method achieved higher classification accuracy. Lee et al. (2006) conducted in-depth research on the application of a decision tree model in the field of credit evaluation and achieved satisfactory results. These researchers noted that the decision tree model involves automatic feature selection, a strong ability to process missing data, and accurate analytical conclusions; the model has clear advantages, but its stability is weaker (Lee et al., 2006). Overall, although the statistical credit evaluation model performs well in certain application scenarios and has adequate interpretability, it requires the raw data to conform to strict statistical assumptions and is usually effective only when the sample size is large. The intelligent evaluation model effectively compensates for the inadequacies of the statistical evaluation model. Commonly used intelligent evaluation models include neural networks, support vector machines and genetic algorithms. Desai et al. (1996), West (2000), Malhotra and Malhotra (2003), and Hájek (2011) showed that the neural network (NN)

model has a significant advantage when complex nonlinear relationships exist between credit features. Baesens et al. (2003) took the lead in applying the support vector machine model to the field of credit evaluation. Their results showed that the support vector machine-based credit evaluation method was significantly better than linear regression and neural network-based credit evaluation methods. Although the support vector machine method has shown satisfactory application results in credit evaluation, problems remain. For example, choosing the kernel function of the support vector machine and setting the relevant parameters of the model still depend on the knowledge and experience of experts. Davis and Albright (2004) applied genetic algorithms to the development of personal credit scorecards. The combined evaluation model has recently become a major trend in the field of credit evaluation. It effectively exploits the complementary advantages of multiple models and overcomes the defects of a single model. At present, three ways exist to construct a combined credit evaluation model. The first is to serially combine multiple models. Lee et al. (2002) proposed a "two-stage hybrid neural network discriminant" evaluation model, which uses the results of the discriminant analysis method together with other credit features as input units to construct a neural network model. The results showed that this combined evaluation model significantly shortens the training time of the neural network and effectively improves the classification accuracy of the credit evaluation model. The second way to build a combined model is to combine the outputs of multiple credit evaluation models in parallel. Sun and Li (2008) used weighted voting to combine the results of multiple discriminant, logistic regression, support vector machine, neural network, decision tree and nearest neighbor models. The results showed that the combined evaluation model

significantly improved the overall classification accuracy and robustness. The third way is to generate multiple training sets through ensemble algorithms such as bagging or boosting, build models on these training sets using unstable classification algorithms, and finally combine the model analysis results. Finlay (2011) established a variety of ensemble personal credit scoring models based on bagging and boosting and compared their application effects with those of traditional single models. The results showed that the ensemble models are significantly better than single models. In summary, the existing research shows the advantages and disadvantages of different types of models. The performance of a combined model is significantly better than that of a traditional single model, which provides a basis for us to select appropriate research models.

3 Research framework

The research framework shown in Fig. 1 reflects the overall design of this paper. This framework consists of two parts: one is shown above the dashed line (i.e., the analysis of the loan description text, including nonsemantic analysis and semantic analysis), and the other is shown under the dashed line (i.e., the credit evaluation model's construction and comparison).

Fig. 1. Research framework (components: 5P method, seed words, loan description text, word embedding model, word groups, word frequency statistics, nonsemantic features, semantic features, structured features, borrowers, Model 1, Model 2, Model 3, model comparison, robustness test)

3.1 Loan description text analysis

In analyzing the loan description text, we extracted both nonsemantic and semantic features for credit risk evaluation. The nonsemantic features include the number of words, number of sentences, sentiment inclination and text readability. These features effectively reflect both the loan applicants' attitudes toward borrowing and their writing styles, which previous studies have shown are valuable for credit risk analysis. In addition, we extracted the semantic features of the loan description text according to 5P theory, which reflects credit risk with regard to five aspects: personal, purpose, payment, promise and prospect. The 5P credit rating is a method of evaluating credit applications developed by the Federal Reserve Centre (Abbadi and Suleiman, 2013). The 5P system is commonly used to establish a credit rating and provide the basis for loan approval. The 5P model, also called an "expert system", is an important qualitative research method for credit ratings. This model reclassifies the 5C elements but lacks a quantitative analysis (Standifird, 2001). Abbadi and Suleiman (2013) attempted to determine the methods that banks operating in Palestine use to evaluate customers' applications for credit using the 5P, 5C and FAPE methods, as well as which element in each method they concentrated on the most. The results showed that banks concentrated most on personal and payment issues when using the 5P analysis. Specifically, we first selected three seed words for each aspect of 5P theory. For example, regarding the aspect of personal information, we selected "honesty", "integrity" and "friendliness" as seed words to reflect the personal qualities of the borrower. Next, we measured the semantic similarity between seed words and other words in the loan description text using the word embedding model. When a word's semantic similarity to a seed word

exceeded the set threshold, we combined both words into a single group. Finally, we calculated the total frequency of the words in the group as a semantic feature to reflect the semantic intensity of each aspect of 5P theory. In this way, we extracted five semantic features corresponding to five groups of words that reflect five aspects of credit risk based on 5P theory. Semantic similarity computation is the key to extracting semantic features. Using the skip-gram word embedding model shown in Fig. 2, we calculated the semantic similarity between the loan description words and the seed words. The skip-gram is a three-layer neural network model that includes an input layer, a projection (hidden) layer and an output layer. The basic idea is to map each word into a k-dimensional real vector that carries semantic and grammatical information (in general, k ranges from 50 to 200). The semantic similarity between words is determined using the distance between their vectors (e.g., Euclidean distance or cosine similarity). The skip-gram word embedding model thus yields a distributed representation of words in the vector space.

Fig. 2. Skip-gram word embedding model (the current word W(t) at the input layer is projected and used to predict the context words W(t-b), ..., W(t-1), W(t+1), ..., W(t+b) at the output layer)
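Before formalizing the training objective below, the following is a minimal sketch, not the authors' exact pipeline, of how a skip-gram model with hierarchical softmax can be trained on tokenized loan descriptions and queried for word similarity. The corpus file name is a hypothetical placeholder, and gensim and jieba are assumed as implementation tools.

```python
# A minimal sketch: train a skip-gram word embedding model on Chinese loan
# descriptions and query word similarity. Corpus file name is hypothetical.
import jieba
from gensim.models import Word2Vec

# Each line of the (hypothetical) corpus file is one loan description.
with open("loan_descriptions.txt", encoding="utf-8") as f:
    sentences = [jieba.lcut(line.strip()) for line in f if line.strip()]

# sg=1 selects the skip-gram architecture; hs=1 with negative=0 selects
# hierarchical softmax training, as described in Section 3.1.
model = Word2Vec(
    sentences,
    vector_size=100,   # k-dimensional word vectors (k typically 50-200)
    window=5,          # context window size b
    min_count=5,
    sg=1,
    hs=1,
    negative=0,
)

# Cosine similarity between a seed word and a candidate word
# (assuming both words occur often enough to be in the vocabulary).
print(model.wv.similarity("诚实", "守信"))

# Top-N words closest to a seed word, used for seed-word expansion.
print(model.wv.most_similar("诚实", topn=10))
```

The similarity and most_similar queries correspond, respectively, to the threshold-based word grouping described above and the seed-word selection procedure in Section 4.2.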

Suppose a set of word sequences W1, W2, ..., WT exists in the corpus; the skip-gram model aims to predict context words from the current word Wt. The objective function of the model is as follows:

$$F = \frac{1}{T}\sum_{t=1}^{T}\ \sum_{-b \le i \le b,\; i \ne 0} \log p(w_{t+i} \mid w_t)$$

where b is a constant that determines the size of the context window; larger values of b denote longer training times and greater accuracy. We chose the hierarchical softmax method to train the skip-gram model.

3.2 Hypothesis development

The 5P theory was employed to evaluate the credit risk of borrowers based on five aspects: personal factors, purpose factors, payment factors, promise factors and prospect factors. Personal factors reflect the borrower's honesty, friendliness and integrity. Previous studies have suggested that the personal factors of economic actors are a type of diagnostic and qualitative information that affects economic behavior outcomes (Martens et al., 2007; Michels, 2012; Wang and Lin, 2015). The purpose of the loan is an important basis for the lender to perceive the borrower's credit risk and make investment decisions. From the perspective of information asymmetry theory, personal factors and purpose factors act as signals to mitigate adverse selection before the loan transaction and moral hazard after the transaction, respectively, which might affect the credit of the borrower (Wang and Lin, 2015; Chao et al., 2014). The payment factor reflects the borrower's ability to repay, which is an important measure of credit risk (Peng and Ye, 2011). The borrower can disclose his or her income and assets through the loan description text to reflect his or her ability to repay. The P2P platform also reflects the borrower's payment factors by disclosing information such as repayment history as a signal of credit risk. The promise factor is a signal of commitment and assurance

provided by the borrower or a third party that might enhance the trust and endorsement of the borrower (Mukherjee and Nath, 2003). Chen et al. (2016) studied the relationship between individuals' group social capital and their lending outcomes on the P2P market. The results showed that receiving the group leader's endorsement for a borrowing request is viewed as a positive signal of the borrower's credit quality among potential lenders, either within or outside the group. The payment factor and promise factor directly affect the borrower's repayment behavior. From the perspective of social capital, the economic behavior of market participants is closely related to their social relationships (Lin et al., 2013). The P2P friendship network is an important channel for borrowers to obtain financing through social relationships, and it plays a supervisory role in their credit (Chen et al., 2016). Bad repayment behaviors can lead to tensions among friends and can damage the borrower's social capital (Karlan, 2007). If the economic benefits of bad repayment are insufficient to compensate the borrower for his or her social capital loss, then the rational borrower will maintain a good credit history. Therefore, we conclude that the payment and promise factors directly affect the borrower's repayment behavior and are significantly correlated with the borrower's social capital and credit risk. The prospect factor effectively reflects the stability and sustainable development of the borrower's future economic situation. Individuals' ability and willingness to repay are susceptible to uncertain events. Creditworthy borrowers might default in the future because of an emergency (Jiang et al., 2019). Therefore, we believe that the prospect factor is related to the dynamic change in the credit risk of borrowers. Based on the above theoretical arguments, we propose the following hypotheses:

H1a: The 5P factors significantly affect borrowers' financing performance.
H1b: The 5P factors significantly affect borrowers' default behavior.

We further analyzed the explanatory power of the 5P factors with regard to the credit risk of different types of borrowers. According to signal theory, when the lender perceives the borrower's credit risk through different signals, a high-cost signal is usually more relevant for credit quality judgments (Aggarwal et al., 2012). Repeat borrowers transmit risk signals not only through textual soft information but also through hard information such as repayment history. Because some hard signals are costly to produce, they reflect credit quality more effectively. For example, the signal cost of repayment history is high because it takes a long time to build and maintain. Therefore, the influence of the 5P soft factors on the creditworthiness of borrowers can be diluted by hard signals. Because new borrowers lack hard signals (e.g., repayment history), we conclude that the 5P factor model is more effective when judging the credit risk of new borrowers. Therefore, we propose the following hypotheses:

H2a: Compared with repeat borrowers, the 5P factors have a stronger ability to explain the financing performance of new borrowers.
H2b: Compared with repeat borrowers, the 5P factors have a stronger ability to explain the default behavior of new borrowers.

3.3 Credit evaluation model

During the process of constructing and comparing credit evaluation models, we created models based on different feature sets and compared the performance of these models via a significance test. Before building the models, we screened features using a two-step feature selection method because of the weak explanatory ability of certain features or the existence

of redundant features. After feature selection, we constructed Model 1 based only on the features extracted from structured information. Then, Model 2 was constructed based on the structured features and the nonsemantic features extracted from the loan description text. Finally, we constructed Model 3 based on the structured features, nonsemantic features and semantic features. We adopted this incremental approach to examine the credit risk prediction ability of each type of feature. We conducted a significance test on the performance indices of the models and studied whether the text features significantly improved the performance of the credit evaluation model. An ensemble strategy significantly improves the accuracy of single algorithm-based credit evaluation models (Lessmann et al., 2015). Thus, we applied the AdaBoost ensemble strategy. The specific process of this strategy is as follows.

Input: Training set D = {(x_1, y_1), (x_2, y_2), ..., (x_m, y_m)}, where x_i ∈ X is the feature set for credit risk analysis, including the structured features, nonsemantic features, and semantic features, and y_i ∈ Y = {0, 1} is the label of the sample, where 0 represents nondefault samples and 1 represents default samples; base classifier h_i(x); number of iterations T.

Processing:
(1) Initialization: an initial weight D_1(i) = 1/m was given to each sample in D, and the training set of the first base classifier was sampled with equal probability.
(2) For iterations t = 1, 2, ..., T:
  1) The training sample distribution D_t was generated from the training set D with probability D_t(i), and the base classifier h_t(x) was trained.
  2) The training error of the base classifier h_t(x) was calculated as
     $$\varepsilon_t = \frac{EN}{RN + EN}$$
     where RN and EN represent the numbers of correctly classified and misclassified samples, respectively.
  3) If ε_t ≥ 0.5, return to step (1); otherwise, calculate the weight of h_t(x) as
     $$\alpha_t = \frac{1}{2}\ln\left(\frac{1-\varepsilon_t}{\varepsilon_t}\right).$$
  4) The sample weights were updated. For the (t+1)-th base classifier, the weight of each sample is
     $$D_{t+1}(i) = \frac{D_t(i)}{Z_t} \times \begin{cases} e^{-\alpha_t} & \text{if } h_t(x_i) = y_i \\ e^{\alpha_t} & \text{if } h_t(x_i) \ne y_i \end{cases} = \frac{D_t(i)\exp(-\alpha_t y_i h_t(x_i))}{Z_t}$$
     where Z_t is the normalization factor.

Output: The classification results of the base classifiers were integrated, and the integration result was
$$H(x) = \operatorname{sign}\left(\sum_{t=1}^{T} \alpha_t h_t(x)\right).$$

We chose three classification techniques widely used in the credit evaluation field as the base classifiers of the AdaBoost ensemble strategy: logistic regression (LR), decision tree (DT), and the neural network (NN). Previous studies show that logistic regression (LR) is the best statistical analysis technology and the industry standard for creditworthiness analysis (Abdou and Pointon, 2011; Lessmann et al., 2015). However, statistical analysis often requires strict assumptions such as a normal distribution, and the results are only valid under the condition of a large sample (Abdou and Pointon, 2011). Machine learning is another effective approach to credit assessment that is not subject to the stringent assumptions required for statistical analyses. DT is a classical machine-learning technique whose result can be visualized. NN is a widely used machine-learning technique that can adapt well to the nonlinear relationships between variables.
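The paper implements this strategy with the WEKA tools (Section 4.3). Purely as an illustration, the following is a minimal sketch of the same idea in scikit-learn (version 1.2 or later assumed), boosting two of the base classifiers named above with tenfold cross-validation; the generated data set and the AUC score are placeholders, not the study's data or evaluation criteria.

```python
# A minimal sketch of the AdaBoost ensemble strategy with two base
# classifiers; the data set below is synthetic and stands in for the
# structured, nonsemantic, and semantic features.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Placeholder data: 32 features, roughly 8% "default" labels.
X, y = make_classification(n_samples=2000, n_features=32,
                           weights=[0.92, 0.08], random_state=0)

base_learners = {
    "AdaBoost-DT": DecisionTreeClassifier(max_depth=3),
    "AdaBoost-LR": LogisticRegression(max_iter=1000),
}

for name, base in base_learners.items():
    clf = AdaBoostClassifier(estimator=base, n_estimators=50, random_state=0)
    # Tenfold cross-validation, as used in Section 4.1.
    scores = cross_val_score(clf, X, y, cv=10, scoring="roc_auc")
    print(f"{name}: mean AUC = {scores.mean():.3f}")
```

A neural network base learner, as used in the paper via WEKA, would require a resampling-based boosting implementation because scikit-learn's MLPClassifier does not accept sample weights.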

4 Data and research model

4.1 Data collection

The data used in our study were collected from a well-known P2P lending platform in China (Renrendai). Founded in 2010, Renrendai is a leading Chinese online finance company dedicated to providing high-quality, professional personal financial information services. As of March 31, 2018, the cumulative turnover of Renrendai exceeded 54.6 billion yuan with a total turnover of 729,000, serving more than 1.5 million lenders and borrowers. The data were collected between January 2012 and December 2016 and contain approximately 120,000 borrower records. A total of 26,468 successful loan applications were recorded. According to the application records, 7.68% reflect a failure to make a timely payment. Regarding the number of applications, 67,011 records were the first loans of new borrowers on the platform; the remaining records involved multiple loans and represented repeat borrowers. After the data were collected, the raw data were preprocessed. Because only 162 records in our dataset contained missing values, deleting these records would not affect the data distribution of each attribute; therefore, we deleted the records with missing values. In addition, we did not balance the raw data, although the numbers of creditworthy users and users with bad credit were not balanced. Although balancing the data affects the absolute performance of a classifier, in this paper, we compared the relative performance of the classifiers before and after adding semantic features. Working with unbalanced data therefore does not affect our conclusions, and it is consistent with the data processing method of

Lessmann et al. (2015). We used the tenfold cross-validation method to divide the training and test sets during the research.

4.2 Feature extraction

We extracted 32 features, 20 of which were extracted from structured information (FA) and 12 of which were extracted from soft information. The soft information features were extracted from the loan description text and were further divided into 7 nonsemantic features (FB) and 5 semantic features (FC). Specifically, the features in the FA group reflect the financial status, loan history, and demographic information of borrowers. The features in the FB group reflect the readability, sentiment and length of the loan description text. The features in the FC group reflect the 5P (personal, purpose, payment, promise and prospect) semantic information of the loan description text. Table 1 shows the results of the semantic analysis of the loan description text using the word embedding model and the 5P credit evaluation method. For each 5P dimension, Table 1 presents the seed words and the top five words with the highest similarity to the seed words, together with their similarity values. We used the total word frequency of all words in each group as the semantic feature of the loan description text. The selection of seed words is an important part of the 5P semantic analysis. We chose seed words for the semantic analysis in three steps. Step 1: For each semantic dimension of 5P, an initial word that represents the semantics of the dimension was randomly selected from the loan description text corpus (e.g., for the personal dimension, we chose "integrity" as the initial word). Step 2: The word embedding model was used to analyze the top N words in the

corpus that had the highest semantic similarity with the initial word (in this case, "integrity"). Step 3: We analyzed the word frequency of these N words within the loan description text corpus and selected the three words with the highest word frequency as the seed words. The high frequency of these words indicates that people use them most often when expressing the semantics of a certain dimension. The experiment revealed that when N is large enough, the selected seed words are consistent. For example, among the first 100 words and the first 200 words with the highest semantic similarity to the initial word (e.g., "integrity"), the three words with the highest word frequency were the same because lower semantic similarity rankings denoted less frequent use of those words to express the conceptual category of a certain semantic dimension. Furthermore, when N was large enough, the final selected seed words were not affected by the initial word randomly selected during the first step. For example, for the personal dimension, whether the initial word was "integrity" or another word that expresses personal characteristics, when N was large enough, the selected seed words were consistent. This result shows that the method is stable. In our research, we chose N=100 because this value ensured a stable seed word selection result. After determining the seed words, we continued to use the word embedding model to analyze words with semantic similarity to the seed words and used these words as the basis of the 5P semantic feature extraction. The semantic similarity threshold setting was the key to this analysis. If the threshold is set too high, then the words representing the semantic dimensions of 5P cannot be fully retrieved. However, if the threshold is set too low, then some words might appear in multiple semantic dimensions, causing confusion because certain words have different semantic meanings in different contexts. In our research, we found that

0.5 was an adequate semantic similarity threshold because it fully retrieved the vocabulary representing each 5P semantic dimension while ensuring that the retrieved words belonged to a specific semantic dimension.

Table 1 The semantic features extracted using the 5P analysis method

Personal factor. Seed words: 诚实、友好、正直 (honest, friendly, upright). Top five similar words: 守信 (trustworthy, 0.7670), 一言九鼎 (keep one's word, 0.7278), 稳重 (steady, 0.7046), 真诚待人 (sincere, 0.7045), 守时 (be punctual, 0.6923).

Purpose factor. Seed words: 买车、资金周转、家电 (buy cars, capital turnover, domestic appliances). Top five similar words: 日用消费 (consumer goods, 0.8943), 购车 (purchase a car, 0.8884), 婚礼筹办 (wedding preparation, 0.8023), 买房 (buy a house, 0.7518), 购买设备 (purchasing equipment, 0.6988).

Payment factor. Seed words: 利润、收入、收益 (profit, income, revenue). Top five similar words: 利润率 (profit margin, 0.7442), 盈利 (profitable, 0.7430), 销售额 (sales, 0.7169), 营业额 (turnover, 0.6917), 工资收入 (salary, 0.6901).

Promise factor. Seed words: 承诺、保证、担保 (promises, guarantees, warranties). Top five similar words: 按时 (on time, 0.9073), 按期 (on schedule, 0.8684), 如期 (as scheduled, 0.7983), 必定 (must, 0.7859), 付息 (interest payment, 0.7715).

Prospect factor. Seed words: 前景、未来、发展 (prospect, future, development). Top five similar words: 发展前景 (development prospect, 0.8947), 市场前景 (market expectation, 0.8736), 创造价值 (creation of value, 0.7979), 发展潜力 (development potential, 0.7745), 利润目标 (profit target, 0.7702).
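The following is a minimal sketch, assuming the gensim skip-gram model trained in the earlier sketch and jieba tokenization, of the 5P semantic-feature extraction described in Sections 3.1 and 4.2: expand each dimension's seed words into a word group via a 0.5 cosine-similarity threshold, then use the total frequency of group words in a loan description as that dimension's feature value.

```python
# A minimal sketch of 5P semantic feature extraction; `model` is assumed to
# be the trained gensim Word2Vec model from the earlier sketch.
import jieba

seed_words = {
    "personal": ["诚实", "友好", "正直"],
    "purpose":  ["买车", "资金周转", "家电"],
    "payment":  ["利润", "收入", "收益"],
    "promise":  ["承诺", "保证", "担保"],
    "prospect": ["前景", "未来", "发展"],
}

def build_word_groups(model, seeds, threshold=0.5):
    """Collect vocabulary words whose similarity to any seed word exceeds the threshold."""
    groups = {}
    for dim, words in seeds.items():
        group = set(words)
        for w in model.wv.index_to_key:
            if any(model.wv.similarity(w, s) > threshold
                   for s in words if s in model.wv):
                group.add(w)
        groups[dim] = group
    return groups

def five_p_features(text, groups):
    """Total frequency of each dimension's group words in one loan description."""
    tokens = jieba.lcut(text)
    return {dim: sum(tokens.count(w) for w in group) for dim, group in groups.items()}

# Usage (with the model from the earlier sketch):
# groups = build_word_groups(model, seed_words)
# print(five_p_features("本人诚实守信，借款用于资金周转，保证按时还款。", groups))
```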

Importantly, Chinese is a high-context language. In a culture with a high-context language, people pay attention to context rather than content during communication. Words with similar literal meanings, even the same words, might have essential differences across different contexts. This difference is also reflected in the extraction and learning of lexical semantics. Because the word embedding model analyzes semantics within a specific context,

the results better reflect the accurate meaning of the vocabulary in a high-context culture. For example, although "守时" (to be punctual) and "按时" (to be on time) translate into English with similar literal meanings, in the Chinese context, "守时" indicates that a person has a strong sense of time in everyday life, reflecting his or her personality and morality, whereas, when used in a Chinese loan description text, "按时" indicates that a person is able to repay on schedule, reflecting the promise factor concerning the borrower's repayment. In other words, words with similar literal meanings, combined with the Chinese context, are categorized into different 5P semantic dimensions by the word embedding model because this model combines contextual information when analyzing lexical semantics. Therefore, our semantic analysis results reflect the high-context characteristics of Chinese. Table 2 shows all of the credit features extracted, the ranges of their values and a related description of these features.

Table 2 Credit features used in the analysis

Structured features (FA):
F1 Loan amount: 3000-3000000 (Yuan)
F2 Annual interest rate: 0.03-0.244
F3 Repayment period: 1-36 (Month)
F4 Way of guarantee: 0, 1
F5 Prepayment rates: 0-1
F6 Age: 25-74 (Integer)
F7 Education: 0-4 (Integer)
F8 Marriage: 0-3 (Integer)
F9 Number of applications: 1-148 (Integer)
F10 Line of credit: 0-3000000 (Yuan)
F11 Successful applications: 1-144 (Integer)
F12 Total amount of loan history: 3000-9000000 (Yuan)
F13 Income level: 0-7 (Integer)
F14 House: yes-1, no-0
F15 Housing loan: yes-1, no-0
F16 Car: yes-1, no-0
F17 Car loan: yes-1, no-0
F18 Industry: 0-20 (Integer)
F19 Company size: 0-4 (Integer)
F20 Working years: 0-4 (Integer)

Non-semantic features (FB):
F21 Number of characters: 0-482 (Integer)
F22 Number of words: 0-243 (Integer)
F23 Number of sentences: 0-69 (Integer)
F24 Readability: 0-444 (Integer)
F25 Sentiment: 0-1
F26 Title characters: 2-24 (Integer)
F27 Title words: 0-16 (Integer)

Semantic features (FC):
F28 Personal factor: 0-10 (Integer)
F29 Purpose factor: 0-5 (Integer)
F30 Payment factor: 0-8 (Integer)
F31 Promise factor: 0-11 (Integer)
F32 Prospect factor: 0-11 (Integer)

Note: The way of guarantee (F4) takes the value 0 for the principal guarantee and 1 for the user benefit guarantee mechanism provided by the online lending platform. Education (F7), marriage (F8), income (F13), industry (F18), company size (F19), and working years (F20) are multicategory variables. For example, feature F8 is scored as 0, 1, 2, or 3, representing unmarried, divorced, married, and widowed, respectively. Readability (F24) is measured as the number of characters misused in the loan description text, where larger values denote poorer readability.
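Table 2's three feature groups map directly onto the incremental Models 1-3 of Section 3.3. A minimal sketch, with hypothetical column names standing in for F1-F32 and the default label, of how the three feature sets could be assembled:

```python
# A minimal sketch of the incremental feature sets for Models 1-3:
# Model 1 uses structured features only, Model 2 adds nonsemantic text
# features, and Model 3 adds the 5P semantic features.
import pandas as pd

# df is assumed to hold one row per loan with all 32 features and a
# "default" label column (hypothetical names).
df = pd.DataFrame()  # placeholder

F_A = [f"F{i}" for i in range(1, 21)]    # structured features
F_B = [f"F{i}" for i in range(21, 28)]   # nonsemantic text features
F_C = [f"F{i}" for i in range(28, 33)]   # 5P semantic features

feature_sets = {
    "Model 1": F_A,
    "Model 2": F_A + F_B,
    "Model 3": F_A + F_B + F_C,
}

# for name, cols in feature_sets.items():
#     X, y = df[cols], df["default"]
#     ...fit the AdaBoost ensemble of Section 3.3 and record TPR/TNR/cost...
```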

In Table 2, some structured features appear to contain the same information as the semantic features, but in fact they are significantly different. For example, both F3 (repayment period) and F30 (payment factor) seem to reflect repayment history. In fact, F30 contains more credit risk information than repayment history alone. In other words, repayment history is not sufficient to explain the credit risk of the borrower: borrowers with good repayment histories may still default in the future. From a practical point of view, a borrower's repayment history before the first default is always good. From a theoretical point of view, survival analysis notes that even users who are not currently in default carry potential risks (Jiang et al., 2019). Payment factors in the loan description text can supplement information such as repayment history. Some borrowers further describe the source of repayment and the reasons for any late repayments in the payment factor, which can also be very helpful in determining credit risk. For example, some borrowers may have a bad repayment history because of accidental factors or force majeure, but the payment factors described in the loan description text may show that the borrower has a stable and reliable source of funds, which can serve as credit enhancement. Conversely, some borrowers who performed well in their repayment history do not effectively explain the payment factors, which may mean that they subconsciously lack confidence in these factors, such as the source of repayment. Therefore, the payment factors can be an effective signal in credit risk analysis. In a broader sense, P2P platforms may be poorly designed in terms of the types of structured information collected, analyzed, and displayed

due to bounded rationality and a lack of comprehensive consideration of the required structured information. Borrowers are unable to find a suitable module on P2P websites to deliver the useful signals they wish to display (such as the source of repayment). These signals can be supplemented by the loan description text, and obtaining them can help the lender to further judge the credit risk of the borrower.

4.3 Model selection

We used the AdaBoost ensemble strategy to combine the base classifiers and improve the performance of the credit evaluation model. The base classifiers included logistic regression (LR), the C4.5 decision tree (C4.5) and the backpropagation (BP) neural network. In this study, we used the WEKA data mining tools. The logistic regression (LR) model uses maximum likelihood estimation to establish a regression-based classification model for binary or multinomial dependent variables. Assuming that the dependent variable y is a binary variable (0, 1), x is the independent variable; in the default prediction model, borrower default is the categorical dependent variable, and the credit risk features are the explanatory variables. C4.5, an extension of the ID3 algorithm, is a decision-tree building algorithm that adopts a divide-and-conquer strategy and the entropy measure for object classification. Its goal is to classify mixed objects into their associated classes based on the objects' attribute values. A standard C4.5 algorithm is implemented in the WEKA data mining tools. Backpropagation neural networks are popular for their unique learning capability. The WEKA data mining tools provide a standard three-layer, fully connected backpropagation neural network. We chose the heuristic number, i.e., (the number of nodes in the input layer +

the number of nodes in the output layer)/2, as the number of nodes in the hidden layer. The input layer nodes are the credit risk features, and the output nodes represent the creditworthiness status.

4.4 Performance evaluation criteria

To compare the results of the credit assessment models, we used evaluation criteria based on a confusion matrix, as commonly adopted in data mining and credit analyses. Because the cost of misclassifying a bad-credit case is significantly higher than that of misclassifying a creditworthy case, we calculated the identification performance for bad and good credit separately. The calculation formulas are defined using the following equations. The true positive rate (TPR, i.e., sensitivity) is the percentage of correctly classified cases of creditworthiness:

$$TPR = \frac{TP}{TP + FN}$$

The true negative rate (TNR, i.e., specificity) is the percentage of correctly classified cases of bad credit:

$$TNR = \frac{TN}{TN + FP}$$

The meanings of the items in the TPR and TNR formulas are as follows. True positive (TP): the number of positive examples classified as positive. False positive (FP): the number of negative examples classified as positive (i.e., type II error). True negative (TN): the number of negative examples classified as negative. False negative (FN): the number of positive examples classified as negative (i.e., type I error). We also constructed an evaluation criterion to measure the misclassification costs of credit assessment models based on Lessmann et al. (2015). Hofmann reported that the ratio of misclassification costs associated with type II and type I errors is 5:1, which was used by Abdou (2009). In this paper, this relative ratio was used to construct the cost indicator that represents the misclassification cost of a credit risk evaluation model. Therefore, we constructed the cost as follows:

$$Cost = 5 \times FPR + FNR$$

$$FPR = \frac{FP}{FP + TN}, \quad FNR = \frac{FN}{FN + TP}$$
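A minimal sketch of these criteria as a function of confusion-matrix counts; the fivefold weighting follows the 5:1 cost ratio above, and the example counts are hypothetical.

```python
# A minimal sketch of the Section 4.4 evaluation criteria computed from
# confusion-matrix counts (positive = creditworthy, negative = bad credit).
def evaluation_criteria(tp, fn, tn, fp):
    tpr = tp / (tp + fn)   # sensitivity: correctly classified creditworthy cases
    tnr = tn / (tn + fp)   # specificity: correctly classified bad-credit cases
    fpr = fp / (fp + tn)   # bad credit misclassified as creditworthy (type II error)
    fnr = fn / (fn + tp)   # creditworthy misclassified as bad credit (type I error)
    cost = 5 * fpr + fnr   # type II errors weighted five times type I errors
    return {"TPR": tpr, "TNR": tnr, "Cost": cost}

# Example with hypothetical counts:
print(evaluation_criteria(tp=900, fn=50, tn=30, fp=20))
```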

5 Results and discussion

5.1 Funding success analysis

We first analyzed the effect of various features on the success of the loan. The results are shown in Table 3. In this section, we chose the logistic regression model instead of intelligent learning algorithms such as a neural network because the borrower is more concerned with the factors that improve the probability of successful borrowing than with whether the model accurately predicts that probability. This decision is consistent with the method adopted by Cai et al. (2016) when analyzing loan success. In addition, in this part, we attempt to explore all of the features that affect the probability of successful borrowing, so we do not screen the features. Table 3 shows the results of the SF.I and SF.II logistic regression models for new and repeat borrowers, respectively.

Table 3 Funding success analysis via logistic regression (columns: Features; SF.I: B, Sig.; SF.II: B, Sig.)
F1 Loan amount F2 Annual interest rate


Among the features extracted from the structural information, F2 (annual interest rate), F7 (education) and F13 (income level) were significantly and positively correlated with the funding success probability of both new and repeat borrowers, whereas F1 (loan amount) and F3 (repayment period) were significantly and negatively correlated with it.

We focus on the effect that the textual features have on the probability of successful borrowing. Table 3 shows that F21 (number of characters), F22 (number of words) and F26 (title characters) were significantly and positively correlated with the success rate of borrowing because, in general, longer text reduces the information asymmetry between borrowers and lenders to a greater extent and enhances lenders' trust in borrowers. However, no significant correlation was found between F23 (number of sentences) and the success rate of a loan, which might be explained by people's habit of using informal expressions in the online environment: people generally do not use formal sentence patterns or punctuation in online texts. F24 (readability) was significantly and negatively correlated with the probability of successful borrowing among both new and repeat borrowers because readability was measured as the number of misused characters in the loan description text, so larger feature values denote poorer readability. Dorfleitner et al. (2016) showed that poor readability indirectly indicates a borrower's low level of education, and length of education is positively correlated with the probability of successful borrowing. Regarding the textual semantic features, F29 (purpose factor) and F31 (promise factor) were significantly and positively correlated with the probability of borrowing success. Thus, explaining the purpose of borrowing and guaranteeing one's repayment ability in the loan description text help to improve the probability of borrowing success because investors can learn about the likelihood of repayment through this information and reduce their perceived risk.

We also found that certain features have different effects on the probability of successful borrowing for new and repeat borrowers. For example, F28 (personal factor) and F30 (payment factor) significantly and positively affect the borrowing success of new borrowers. However, these features do not significantly affect the probability of successful borrowing among repeat borrowers because investors can judge repeat borrowers' ability and willingness to repay through more effective signals such as F11 (successful applications) and F12 (total amount of loan history). According to signal theory, strong signals (F11 and F12) mask the effects of weak signals (F28 and F30). The above results support hypotheses H1a and H2a.
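As a rough illustration of the analysis in this subsection, the sketch below fits a logistic regression of funding success on features F1-F32 separately for new and repeat borrowers. It is a minimal outline with hypothetical file and column names (renrendai_listings.csv, success, borrower_type), not the estimation code used for Table 3.

```python
import pandas as pd
import statsmodels.api as sm

# df is assumed to hold one row per loan request with a binary funding outcome,
# a borrower-type flag, and the features F1-F32 as numeric columns.
df = pd.read_csv("renrendai_listings.csv")        # hypothetical file name
features = [f"F{i}" for i in range(1, 33)]

for group, label in [("new", "SF.I"), ("repeat", "SF.II")]:
    sub = df[df["borrower_type"] == group]
    X = sm.add_constant(sub[features].astype(float))
    model = sm.Logit(sub["success"], X).fit(disp=0)
    print(label, "coefficients (B) and p-values (Sig.)")
    print(pd.DataFrame({"B": model.params, "Sig.": model.pvalues}).round(3))
```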

5.2 Default risk analysis

We built an additional default risk model for successful borrowers. To improve the performance of the model, feature selection was needed. The feature selection process primarily includes two steps: the first step considers the contribution of each feature to the interpretation of the target variable, and the second step considers the correlations and redundancies among the features. Specifically, in the first step we used three indicators, i.e., information gain (IG), information gain ratio (GR), and the chi-square (CS) value, to measure the importance of each feature to the target variable (repayment on time or not). These three methods are suitable for feature selection under different variable relationships and data distributions. Because of the uncertainty of the data distribution and variable relationships (including numerous nonlinear relationships) in the P2P environment, it is difficult to measure the importance of each credit feature to the target variable using a single feature selection method. Therefore, we used the comprehensive ranking of the three indicators to measure the importance of the credit features; the comprehensive ranking was calculated by sorting the average of the three rankings. The sorting results are shown in Table 4.

Table 4 Importance ranking result of features

Features  IG  GR  CS  Comprehensive rank    Features  IG  GR  CS  Comprehensive rank
F2         2   3   4   1                    F21        8  24  14  17
F10        7   5   1   2                    F28       24  14   9  18
F12        1  13   3   3                    F23       14  22  11  19
F1         3  10   7   4                    F32       25  20   8  20
F9        10   7   5   5                    F26       19  25  17  21
F3         6   4  21   6                    F16       20  15  29  22
F18       11  12  10   7                    F27       22  26  18  23
F25       12   8  15   8                    F17       30  18  22  24
F5         4   1  31   9                    F14       21  17  32  25
F4         5   2  30  10                    F15       27  21  26  26
F11       16  16   6  11                    F13       23  27  25  27
F24       18  19   2  12                    F8        29  28  20  28
F31       15  11  13  13                    F20       26  29  28  29
F29       17   6  16  14                    F19       31  30  23  30
F30       13   9  19  15                    F7        32  31  24  31
F22        9  23  12  16                    F6        28  32  27  32
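A minimal sketch of this first feature-selection step is shown below. It ranks features by mutual information (as a stand-in for information gain), a gain-ratio-style normalization (mutual information divided by the feature's own entropy), and the chi-square statistic, then combines the three rankings by their average. It assumes a feature matrix with discretized, non-negative columns and a binary default label, and it is an illustration rather than the implementation used here.

```python
import numpy as np
import pandas as pd
from scipy.stats import entropy
from sklearn.feature_selection import mutual_info_classif, chi2

def rank_features(X: pd.DataFrame, y: pd.Series) -> pd.DataFrame:
    """Rank features by IG, a gain-ratio-style score, and chi-square, then
    combine the three rankings by their average (smaller rank = more important).

    X is assumed to contain discretized, non-negative feature columns.
    """
    ig = mutual_info_classif(X, y, discrete_features=True, random_state=0)
    # gain-ratio-style normalization: IG divided by the feature's own entropy
    feat_entropy = np.array([entropy(X[c].value_counts(normalize=True)) for c in X.columns])
    gr = ig / np.where(feat_entropy > 0, feat_entropy, np.inf)
    cs, _ = chi2(X, y)

    ranks = pd.DataFrame(index=X.columns)
    ranks["IG"] = pd.Series(-ig, index=X.columns).rank(method="min")
    ranks["GR"] = pd.Series(-gr, index=X.columns).rank(method="min")
    ranks["CS"] = pd.Series(-cs, index=X.columns).rank(method="min")
    ranks["Comprehensive"] = ranks.mean(axis=1).rank(method="first")
    return ranks.sort_values("Comprehensive")

# hypothetical usage: rank_features(df[[f'F{i}' for i in range(1, 33)]].astype(int), df["default"])
```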

In the second step, we further examined the redundancy among the credit features. We input all of the credit features into the evaluation models and gradually removed the features that were ranked lower (less important) in step 1 until the model performance was optimized. The result is shown in Fig. 3, where the horizontal axis represents the number of features in each model and the vertical axis represents the misclassification cost of each model. We found that once the top 23 features were retained, the misclassification cost of the models had stabilized; in other words, adding further features did not reduce the misclassification cost any further. Therefore, we chose the top 23 features in Table 4 as the final feature set. Notably, the deleted features were all structural (i.e., in feature set FA); the text features (feature sets FB and FC) were not deleted.

Fig. 3. Feature selection based on redundancy examination
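The redundancy examination in the second step can be outlined as a simple backward elimination driven by the Cost measure: starting from all ranked features, the least important remaining feature is dropped as long as the misclassification cost does not deteriorate. The sketch below assumes a hypothetical fit_and_cost helper that trains a model on a feature subset and returns its (e.g., cross-validated) Cost; it illustrates the procedure rather than reproducing the authors' implementation.

```python
def select_by_cost(ranked_features, X, y, fit_and_cost, tolerance=1e-3):
    """Drop the lowest-ranked features while the misclassification cost
    does not increase by more than `tolerance`.

    ranked_features: column names ordered from most to least important (Table 4).
    fit_and_cost:    callable (X_subset, y) -> Cost value of the trained model.
    """
    kept = list(ranked_features)
    best_cost = fit_and_cost(X[kept], y)
    while len(kept) > 1:
        candidate = kept[:-1]                    # remove the least important feature
        cost = fit_and_cost(X[candidate], y)
        if cost > best_cost + tolerance:         # performance degrades: stop removing
            break
        kept, best_cost = candidate, min(best_cost, cost)
    return kept, best_cost
```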

Next, we used the above 23 features to build the credit evaluation models. For new borrowers, we combined different evaluation models and feature sets to judge their credit status (see Table 5). As Table 5 shows, with the addition of new feature sets, the performance of each credit evaluation model gradually improved. For example, for the AdaBoost+C4.5 model, when feature set FB (extracted from the nonsemantic information of the loan description text) was added, the model's recognition accuracy for default and nondefault users (i.e., TNR and TPR) increased by 10.86% ([0.347-0.313]/0.313) and 1.29% ([0.939-0.927]/0.927), respectively, whereas the model's misclassification cost was reduced by 5.19% ([3.508-3.326]/3.508). Furthermore, when feature set FC (extracted from the semantic information of the loan description text) was added, the model's TNR and TPR increased by 30.67% ([0.409-0.313]/0.313) and 2.27% ([0.948-0.927]/0.927), respectively, whereas the model's misclassification cost was reduced by 14.28% ([3.508-3.007]/3.508). We found similar results for the two additional models (AdaBoost+BP and AdaBoost+LR). These results show that text-related soft information helps to judge the credit status of new borrowers; in particular, the semantic information more effectively reduces the misclassification cost of the credit evaluation model. We draw on signal theory to explain these results. Adding the soft information of the loan description text helped to improve the accuracy of the credit evaluation model because both the semantic and the nonsemantic information of the loan description text act as signals for judging the borrower's credit risk. The hard information available for judging creditworthiness is limited, especially for new borrowers; in this case, the loan description text can become an important signal through which investors perceive the credit risk of borrowers.

Table 5 Discrimination performance of the models based on three feature sets (new borrowers)

New borrowers    Adaboost+C4.5           Adaboost+BP             Adaboost+LR
Feature set      TPR    TNR    Cost      TPR    TNR    Cost      TPR    TNR    Cost
FA               0.927  0.313  3.508     0.931  0.305  3.544     0.936  0.188  4.124
FA+FB            0.939  0.347  3.326     0.942  0.321  3.453     0.961  0.212  3.979
FA+FB+FC         0.948  0.409  3.007     0.946  0.374  3.184     0.965  0.308  3.495
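The comparisons in Table 5 can be reproduced in outline by training an AdaBoost ensemble over a base learner on the nested feature sets FA, FA+FB and FA+FB+FC. The sketch below uses scikit-learn with a CART decision tree standing in for C4.5, reuses the hypothetical df and the credit_metrics helper from the earlier sketches, and assumes a binary repayment label named repaid; the BP neural network base learner is omitted because scikit-learn's MLP does not accept the instance weights AdaBoost requires. It illustrates the experimental design, not the exact configuration used in the paper.

```python
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

# the 23 retained features grouped as in the text
FA = ["F1", "F2", "F3", "F4", "F5", "F9", "F10", "F11", "F12", "F16", "F18"]  # structural
FB = [f"F{i}" for i in range(21, 28)]                                          # nonsemantic text
FC = [f"F{i}" for i in range(28, 33)]                                          # semantic (5P) text

base_learners = {
    "Adaboost+C4.5": DecisionTreeClassifier(max_depth=3),    # CART standing in for C4.5
    "Adaboost+LR": LogisticRegression(max_iter=1000),
}
feature_sets = {"FA": FA, "FA+FB": FA + FB, "FA+FB+FC": FA + FB + FC}

for set_name, cols in feature_sets.items():
    for model_name, base in base_learners.items():
        # `estimator` is named `base_estimator` in scikit-learn versions before 1.2
        clf = AdaBoostClassifier(estimator=base, n_estimators=50, random_state=0)
        pred = cross_val_predict(clf, df[cols], df["repaid"], cv=10)
        print(set_name, model_name, credit_metrics(df["repaid"], pred))
```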

We also analyzed the effect of each semantic feature in feature set FC on the performance of the credit evaluation model when judging the credit status of new borrowers (see Table 6). These semantic features include the personal factor (F28), purpose factor (F29), payment factor (F30), promise factor (F31) and prospect factor (F32). Table 6 shows that the promise factor had the greatest influence on the performance indicators (TPR, TNR and Cost) of the credit evaluation model, whereas the prospect factor had the weakest influence. This result might be because the prospect factor reflects the rate of return when the loan is used for investment; for small and micro loans, however, most people borrow money for consumption rather than investment, so borrowers describe prospect factors less often, and these factors therefore alleviate less of the information asymmetry.

Table 6 Discrimination performance of each semantic feature (new borrowers)

New borrowers    Adaboost+C4.5           Adaboost+BP             Adaboost+LR
Feature set      TPR    TNR    Cost      TPR    TNR    Cost      TPR    TNR    Cost
FA+FB            0.939  0.347  3.326     0.942  0.321  3.453     0.961  0.212  3.979
FA+FB+F28        0.942  0.362  3.248     0.943  0.329  3.412     0.961  0.232  3.879
FA+FB+F29        0.943  0.376  3.177     0.946  0.344  3.334     0.961  0.258  3.749
FA+FB+F30        0.943  0.371  3.202     0.943  0.336  3.377     0.962  0.243  3.823
FA+FB+F31        0.946  0.376  3.174     0.946  0.347  3.319     0.964  0.269  3.691
FA+FB+F32        0.940  0.356  3.280     0.942  0.324  3.438     0.961  0.223  3.924
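The single-factor comparisons in Table 6 (and in Table 8 below) follow the same ablation pattern: each semantic factor is added to the FA+FB baseline in turn and the model is re-evaluated. A compact sketch, reusing the objects defined in the previous snippet:

```python
semantic_factors = {"F28": "personal", "F29": "purpose", "F30": "payment",
                    "F31": "promise", "F32": "prospect"}

clf = AdaBoostClassifier(estimator=DecisionTreeClassifier(max_depth=3),
                         n_estimators=50, random_state=0)
for factor, name in semantic_factors.items():
    cols = FA + FB + [factor]                    # baseline plus one semantic factor
    pred = cross_val_predict(clf, df[cols], df["repaid"], cv=10)
    print(f"FA+FB+{factor} ({name} factor):", credit_metrics(df["repaid"], pred))
```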

We also combined different evaluation models and feature sets to judge the credit status of repeat borrowers (see Table 7). As Table 7 shows, with the addition of new feature sets, the performance of each credit evaluation model gradually improved. For example, for the AdaBoost+LR model, when feature set FB (extracted from the nonsemantic information of the loan description text) was added, the model's recognition accuracy for default and nondefault users (TNR and TPR, respectively) increased by 11.06% ([0.231-0.208]/0.208) and 2.73% ([0.979-0.953]/0.953), respectively, whereas the model's misclassification cost was reduced by 3.52% ([4.007-3.866]/4.007). Furthermore, when feature set FC (extracted from the semantic information of the loan description text) was added, the model's TNR and TPR increased by 52.88% ([0.318-0.208]/0.208) and 2.94% ([0.981-0.953]/0.953), respectively, whereas the model's misclassification cost was reduced by 14.42% ([4.007-3.429]/4.007). We found similar results for the two other models (AdaBoost+C4.5 and AdaBoost+BP). These results show that text-related soft information also helps to judge the credit status of repeat borrowers; in particular, the semantic information more effectively reduces the misclassification cost of the credit evaluation model.

In addition, comparing Table 5 with Table 7, we found that feature sets FB and FC improve the performance of the evaluation model more clearly when predicting the credit status of new borrowers. For example, when the AdaBoost+BP model was used to predict the credit status of new borrowers, adding feature set FC reduced the overall misclassification cost (Cost) of the model by 0.269 (3.453-3.184); however, when the model was used to predict the credit status of repeat borrowers, adding feature set FC reduced the Cost by only 0.228 (3.212-2.984). From the perspective of signal theory, this result might arise because repeat borrowers provide more hard information signals (e.g., loan records and repayment history), and these hard information signals are more expensive to obtain and more helpful when judging the credit risk of borrowers. Signal theory states that a signal might be costly if it takes a long time to build and maintain; on P2P platforms, the positive repayment history of repeat borrowers might serve as a costly signal of their trustworthiness because it takes time to build and maintain. The above results support hypotheses H1b and H2b.

Furthermore, we analyzed the effect of each semantic feature in feature set FC on the performance of the credit evaluation model when judging the credit status of repeat borrowers (see Table 8). Table 8 shows that the promise factor (F31) had the greatest influence on the performance indicators (TPR, TNR and Cost) of the credit evaluation model, whereas the prospect factor (F32) had the weakest influence. This result is similar to that for the prediction of new borrowers' credit status.

Table 7 Discrimination performance of the models based on three feature sets (repeat borrowers)

Repeat borrowers    Adaboost+C4.5           Adaboost+BP             Adaboost+LR
Feature set         TPR    TNR    Cost      TPR    TNR    Cost      TPR    TNR    Cost
FA                  0.948  0.406  3.022     0.952  0.352  3.288     0.953  0.208  4.007
FA+FB               0.957  0.432  2.883     0.958  0.366  3.212     0.979  0.231  3.866
FA+FB+FC            0.963  0.485  2.612     0.961  0.411  2.984     0.981  0.318  3.429

Table 8 Discrimination performance of each semantic feature (repeat borrowers)

Repeat borrowers    Adaboost+C4.5           Adaboost+BP             Adaboost+LR
Feature set         TPR    TNR    Cost      TPR    TNR    Cost      TPR    TNR    Cost
FA+FB               0.957  0.432  2.883     0.958  0.366  3.212     0.979  0.231  3.866
FA+FB+F28           0.959  0.441  2.836     0.959  0.370  3.191     0.979  0.243  3.806
FA+FB+F29           0.958  0.454  2.772     0.961  0.379  3.144     0.981  0.265  3.694
FA+FB+F30           0.959  0.449  2.796     0.960  0.373  3.175     0.980  0.251  3.765
FA+FB+F31           0.962  0.462  2.728     0.962  0.382  3.128     0.982  0.279  3.623
FA+FB+F32           0.958  0.436  2.862     0.958  0.368  3.202     0.979  0.236  3.841

To study whether text-related soft information significantly improves the performance of the credit evaluation model, we used a paired t-test to analyze whether the misclassification cost of the model is significantly reduced when feature sets FB and/or FC are added. The results are shown in Table 9. We only performed a t-test on the Cost indicator because Cost comprehensively reflects the overall performance of the model in identifying both default and nondefault users, whereas the TPR (or TNR) only reflects the performance of the model in identifying nondefault (or default) users. In other words, a t-test on Cost comprehensively reflects whether the overall performance improvement of the model is significant for both types of users, whereas t-tests on TPR or TNR would only reflect whether the improvement is significant for one type of user and would therefore be one-sided. Table 9 shows that, for both new and repeat borrowers, the misclassification cost of the credit evaluation model decreased significantly when we added feature sets FB and/or FC. Additionally, feature set FC had a more significant effect on the models' Cost values.

Table 9 Pairwise t-tests on accuracy across different feature sets (p<0.1*, p<0.05**, p<0.01***)

New borrowers        Adaboost+C4.5         Adaboost+BP           Adaboost+LR
Feature set          t       p             t       p             t       p
FA                   2.161   0.059 *       2.698   0.024 **      2.177   0.057 *
FA+FB                2.948   0.016 **      3.179   0.011 **      4.385   0.002 ***

Repeat borrowers     Adaboost+C4.5         Adaboost+BP           Adaboost+LR
Feature set          t       p             t       p             t       p
FA                   2.250   0.051 *       1.871   0.094 *       2.049   0.071 *
FA+FB                2.620   0.028 **      2.316   0.046 **      3.638   0.005 ***
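The tests in Table 9 compare, observation by observation, the Cost of a model trained on the smaller feature set with that of the same model trained with the additional text features. A minimal SciPy sketch is given below; the per-fold cost vectors are toy placeholders standing in for values collected from repeated cross-validation runs.

```python
import numpy as np
from scipy import stats

# toy per-fold Cost values for illustration only; in practice these come from
# repeated cross-validation of the same model on FA, FA+FB and FA+FB+FC.
costs_fa   = np.array([3.52, 3.48, 3.55, 3.50, 3.47, 3.53, 3.49, 3.51, 3.54, 3.46])
costs_fab  = np.array([3.35, 3.30, 3.38, 3.31, 3.29, 3.36, 3.32, 3.33, 3.37, 3.28])
costs_fabc = np.array([3.02, 2.98, 3.05, 3.00, 2.97, 3.03, 2.99, 3.01, 3.04, 2.96])

# one-sided paired t-tests: does adding FB (then FC) reduce Cost?
# (alternative="greater" requires SciPy >= 1.6; it tests mean(a - b) > 0)
t_fb, p_fb = stats.ttest_rel(costs_fa, costs_fab, alternative="greater")
t_fc, p_fc = stats.ttest_rel(costs_fab, costs_fabc, alternative="greater")
print(f"adding FB: t={t_fb:.3f}, p={p_fb:.3f}; adding FC: t={t_fc:.3f}, p={p_fc:.3f}")
```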

Although the performance of the credit evaluation model in predicting the credit status of both new and repeat borrowers improved after the text-related soft information was considered, the degree of improvement differed significantly. In Table 10, ∆Cost1 (∆Cost2) denotes the decrease in misclassification cost when feature set FB (feature set FC) was added to the evaluation model for predicting the credit status of new borrowers, and ∆Cost1' (∆Cost2') denotes the corresponding decrease for repeat borrowers. Table 10 shows that, after adding feature sets FB and FC, the performance improvement of the credit evaluation model was significantly larger when identifying the credit status of new borrowers than when identifying that of repeat borrowers. According to signal theory, this phenomenon might arise because repeat borrowers provide more hard information (e.g., repayment history); these hard information signals are costly and more effective when judging the credit risk of borrowers. Therefore, the role that text-related soft information plays in relieving information asymmetry is relatively limited for repeat borrowers, and the resulting performance improvement of the credit evaluation model is also limited.

Table 10 Pairwise t-tests on accuracy by degree of improvement (p<0.1*, p<0.05**, p<0.01***)

                     Adaboost+C4.5         Adaboost+BP           Adaboost+LR
                     t       p             t       p             t       p
∆Cost1' < ∆Cost1     4.508   0.001 ***     1.843   0.098 *       2.232   0.053 *
∆Cost2' < ∆Cost2     2.153   0.060 *       3.055   0.014 **      2.384   0.041 **

6 Conclusions

This paper examined whether the text-related soft information on Chinese P2P platforms is conducive to predicting the credit risk level of different types of borrowers. We focused on the explanatory ability of the semantic information with regard to the credit status of various borrowers. This paper used the word embedding model and 5P theory to extract semantic features across five dimensions and added these features to a credit evaluation model to more accurately reflect the ability and willingness of borrowers to repay. The AdaBoost ensemble learning strategy was applied to construct the credit evaluation model and improve the learning performance of a single intelligent algorithm. The empirical results for the Renrendai platform show that text-related soft information can be used to effectively predict the credit status of new and repeat borrowers.

The proposed semantic soft factor mining method aims to identify the different patterns of semantic expression used by good (nondefault) and bad (default) borrowers in their loan description text. A given aspect of content that a borrower wants to express is conveyed with a set of keywords. Different borrowers may use different keywords to express the same aspect of content, but the semantics should be similar, thus forming a group of semantically related keywords. Accordingly, semantic analysis is used to measure the semantic similarities among keywords, and semantically related keywords are then grouped together to form semantic features. The semantics of a loan description are subsequently represented as a set of semantic soft factors, each of which corresponds to a semantic feature obtained by integrating all the terms in that feature. These semantic soft factors can then be used to complement the traditional hard factors in credit risk evaluation.

The word embedding model greatly improves the efficiency of extracting semantic information from loan description text compared with manual extraction. For example, Wang and Lin (2015) extracted 2,000 pieces of semantic information that revealed the personality traits of borrowers through random sampling and manual reading. In our study, the semantic information of five dimensions across 120,000 loan descriptions (600,000 eigenvalues in total) was extracted using the word embedding model and 5P credit evaluation theory. In addition, compared with the classic topic-level LDA model, the word embedding model can be used to further analyze lexical-level semantic information and is more suitable for the semantic feature extraction of brief texts in the context of big data. Loan description text is typically brief, and numerous studies have shown that the direct application of the LDA model to brief text faces serious sparsity problems due to the lack of data points: the brief length of the text and the limited word co-occurrence information make it difficult for the LDA model to infer topics at the document level (Yan et al., 2013). The word embedding model used in this paper employs an artificial neural network and the hierarchical softmax training method to learn the meaning of words within a specific context, as well as the semantic relationships between different words, from a large corpus. In addition, this paper used 5P credit evaluation theory to determine the semantic dimensions of the loan description text, which has a solid theoretical basis and avoids the uncertainty of the LDA model in determining the number and type of topics.
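To make the factor-construction step concrete, the sketch below trains a word embedding model with hierarchical softmax and scores each loan description on the five 5P dimensions by grouping its words around seed terms through cosine similarity. The corpus, seed words, and similarity threshold are hypothetical stand-ins (the actual study works on segmented Chinese text), so this is an illustrative outline rather than the authors' pipeline.

```python
from gensim.models import Word2Vec

# toy tokenized corpus; in practice, sentences would be the word-segmented
# Chinese loan descriptions collected from the platform.
sentences = [["need", "money", "to", "decorate", "house", "stable", "salary"],
             ["borrow", "for", "tuition", "repay", "on", "time", "guarantee"],
             ["invest", "in", "shop", "good", "profit", "promise", "repay"]]

# CBOW with hierarchical softmax (min_count lowered only for the toy corpus)
model = Word2Vec(sentences, vector_size=50, window=5, min_count=1, sg=0, hs=1, workers=2)

# hypothetical seed words for the five 5P dimensions
seeds = {"personal": ["stable", "salary"], "purpose": ["decorate", "tuition"],
         "payment": ["salary", "repay"], "promise": ["guarantee", "promise"],
         "prospect": ["profit", "invest"]}

def factor_scores(tokens, threshold=0.5):
    """Score one loan description on the five dimensions: the share of its words
    whose cosine similarity to any seed word of a dimension exceeds the threshold."""
    scores = {}
    for dim, seed_words in seeds.items():
        hits = sum(1 for w in tokens
                   if w in model.wv and any(s in model.wv and
                                            model.wv.similarity(w, s) > threshold
                                            for s in seed_words))
        scores[dim] = hits / max(len(tokens), 1)
    return scores

print(factor_scores(sentences[0]))
```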

This research has several implications for the literature.

First, this paper expands the research on the effect of soft information on financing performance and default behavior and confirms the importance of soft information for credit evaluation. The study reveals that textual soft information has a different explanatory ability for the credit risks of different types of borrowers and that it is therefore necessary to build fine-grained credit evaluation models for specific types of borrowers. The conclusions of this paper are a useful and necessary step for future research seeking to fully understand the role of soft information.

Second, this paper combines the 5P credit evaluation method and the word embedding model to extract credit features from the semantics of loan description texts, enriching research on credit feature extraction and further strengthening the comprehensive application of the relevant methods in the field of credit evaluation. In addition, most of the existing literature uses the 5P factors to qualitatively analyze credit risk; this study provides a theoretical demonstration and a comprehensive quantitative analysis of the relationship between the 5P factors and credit risk, thereby deepening the existing literature.

Third, this paper adds new empirical evidence for the applicability of signal theory to the field of online lending. From the perspective of signal cost, we clarify the reasons for the differences in the effectiveness of various types of signals across different scenarios. Our research confirms that costly signals have a greater effect on the evaluation of borrowers' credit risk: the soft signal of text is more effective in first-time borrowing scenarios, and its effectiveness is weakened by the costly hard signals present in repeat borrowing scenarios.

Fourth, and more generally, credit evaluation information in the big data environment is usually heterogeneous. This study provides new ideas and methods for organizing unstructured credit evaluation information and for integrating and modeling heterogeneous evaluation information.

This research also provides several practical implications.

First, this study might help different types of borrowers on Chinese P2P platforms understand the relationship between textual soft information and financing performance. New borrowers, who lack hard signals, should pay particular attention to transmitting risk signals through textual soft information, thereby reducing the information asymmetry between borrowers and lenders and promoting the occurrence of lending transactions.

Second, this study has implications for improving the presentation of credit information and for constructing targeted credit evaluation models for P2P platforms. P2P platforms should focus on improving the presentation of soft information. For example, in addition to the loan description text, platforms might display the review text exchanged between the borrower and the lender using a clear and friendly interface (which is also an important signal of credit risk). P2P platforms might also design fine-grained credit evaluation models for different types of borrowers to control the overall risk of the platform and create a healthy investment environment.

Third, investors should attempt to understand the credit risk of borrowers through textual soft information. For example, they might use the platform's loan description text or timely communication tools to understand the borrower's purpose, payment and promise factors to better grasp investment opportunities and control investment risks.

Fourth, in a broader sense, because credit evaluation has penetrated various types of businesses, Internet financing scenarios, and social life (e.g., crowdfunding, e-commerce, renting, traveling, and job searching), the results of this paper should inspire the construction of credit evaluation systems in other financial- and life-service scenarios.

This study has several limitations. At present, we only analyzed the credit data found on Renrendai, and the relevant conclusions must be verified using other online lending platforms. In the future, we will conduct empirical research on different online lending platforms to further verify the effectiveness and universality of the proposed method. Additionally, for extreme cases of fraud (i.e., borrowers writing content that is not truthful), which typically comprise a small proportion of P2P loan applications, the proposed method may lose some effectiveness. Future research may consider introducing fraud detection methods (e.g., Carneiro et al., 2017) to further improve predictive performance.

Acknowledgements

This research is supported by the Science Foundation of the Ministry of Education of China (No.18YJC630082), National Natural Science Foundation of China (No.71731005), and the Natural Science Foundation of Anhui Province (No.1908085QG307; No.1808085MF170).

References

Abbadi, S.M., Abu Karsh, S.M., 2013. Methods of evaluating credit risk used by commercial banks. Int. Res. J. Financ. Econ. 111, 146–159.
Abdou, H.A., 2009. An evaluation of alternative scoring models in private banking. J. Risk Financ. 10, 38–53. https://doi.org/10.1108/15265940910924481
Abdou, H.A., Pointon, J., 2011. Credit scoring, statistical techniques and evaluation criteria: A review of the literature. Intell. Syst. Accounting, Financ. Manag. 18, 59–88. https://doi.org/10.1002/isaf.325
Aggarwal, R., Gopal, R., Gupta, A., Singh, H., 2012. Putting money where the mouths are: The relation between venture financing and electronic word-of-mouth. Inf. Syst. Res. 23, 976–992. https://doi.org/10.1287/isre.1110.0402
Ali, F., Kwak, K.S., Kim, Y.G., 2016. Opinion mining based on fuzzy domain ontology and Support Vector Machine: A proposal to automate online review classification. Appl. Soft Comput. J. 47, 235–250. https://doi.org/10.1016/j.asoc.2016.06.003
Baesens, B., Van Gestel, T., Viaene, S., Stepanova, M., Suykens, J., Vanthienen, J., 2003. Benchmarking state-of-the-art classification algorithms for credit scoring. J. Oper. Res. Soc. 54, 627–635.
Basirat, A., Nivre, J., 2017. Real-valued syntactic word vectors (RSV) for greedy neural dependency parsing. In: Proceedings of the 21st Nordic Conference on Computational Linguistics (NoDaLiDa), pp. 23–24.
Bachmann, A., Becker, A., Buerckner, D., Hilker, M., Kock, F., Lehmann, M., Tiburtius, P., Funk, B., 2011. Online peer-to-peer lending - A literature review. J. Internet Bank. Commer. 16, 1–18.
Bhatt, N., Tang, S.Y., 2002. Determinants of repayment in microcredit: Evidence from programs in the United States. Int. J. Urban Reg. Res. 26, 360–376. https://doi.org/10.1111/1468-2427.00384
Bird, R.B., Smith, E.A., 2005. Signaling theory, strategic interaction, and symbolic capital. Curr. Anthropol. 46, 221–248. https://doi.org/10.1086/427115
Burtch, G., Ghose, A., Wattal, S., 2014. Cultural differences and geography as determinants of online prosocial lending. MIS Q. Manag. Inf. Syst. 38, 773–794. https://doi.org/10.25300/MISQ/2014/38.3.07
Cai, S., Lin, X., Xu, D., Fu, X., 2016. Judging online peer-to-peer lending behavior: A comparison of first-time and repeated borrowing requests. Inf. Manag. 53, 857–867. https://doi.org/10.1016/j.im.2016.07.006
Chao, T., Wang, J., Sun, B., 2014. Research on adverse selection and moral hazard in the P2P online lending platform. J. Financ. Econ. 29, 100–108.
Chen, X., Zhou, L., Wan, D., 2016. Group social capital and lending outcomes in the financial credit market: An empirical study of online peer-to-peer lending. Electron. Commer. Res. Appl. 15, 1–13. https://doi.org/10.1016/j.elerap.2015.11.003
Cramer, J.S., 2004. Scoring bank loans that may go wrong: A case study. Stat. Neerl. 58, 365–380. https://doi.org/10.1111/j.1467-9574.2004.00127.x
Davis, S., Albright, T., 2004. An investigation of the effect of Balanced Scorecard implementation on financial performance. Manag. Account. Res. 15, 135–153. https://doi.org/10.1016/j.mar.2003.11.001
Liu, D., Brass, D.J., Lu, Y., Chen, D., 2015. Friendships in online peer-to-peer lending: Pipes, prisms, and relational herding. MIS Q. Manag. Inf. Syst. 39, 729–742. https://doi.org/10.25300/misq/2015/39.3.11
Desai, V.S., Crook, J.N., Overstreet, G.A., 1996. A comparison of neural networks and linear scoring models in the credit union environment. Eur. J. Oper. Res. 95, 24–37. https://doi.org/10.1016/0377-2217(95)00246-4
Dorfleitner, G., Priberny, C., Schuster, S., Stoiber, J., Weber, M., de Castro, I., Kammler, J., 2016. Description-text related soft information in peer-to-peer lending - Evidence from two leading European platforms. J. Bank. Financ. 64, 169–187. https://doi.org/10.1016/j.jbankfin.2015.11.009
Eisenbeis, R.A., 1978. Problems in applying discriminant analysis in credit scoring models. J. Bank. Financ. 2, 205–219. https://doi.org/10.1016/0378-4266(78)90012-2
Finlay, S., 2011. Multiple classifier architectures and their application to credit risk assessment. Eur. J. Oper. Res. 210, 368–378. https://doi.org/10.1016/j.ejor.2010.09.029
Gao, Q., Lin, M., 2015. Lemon or cherry? The value of texts in debt crowdfunding. SSRN Electron. J. https://doi.org/10.2139/ssrn.2446114
Ge, R., Feng, J., Gu, B., 2016. Borrower's default and self-disclosure of social media information in P2P lending. Financ. Innov. 2, 30–39. https://doi.org/10.1186/s40854-016-0048-3
Gogar, T., Hubacek, O., Sedivy, J., 2016. Deep neural networks for web page information extraction. In: IFIP Advances in Information and Communication Technology, pp. 154–163. https://doi.org/10.1007/978-3-319-44944-9_14
Hájek, P., 2011. Municipal credit rating modelling by neural networks. Decis. Support Syst. 51, 108–118. https://doi.org/10.1016/j.dss.2010.11.033
Iyer, R., Khwaja, A.I., Luttmer, E.F.P., Shue, K., 2016. Screening peers softly: Inferring the quality of small borrowers. Manage. Sci. 62, 1554–1577. https://doi.org/10.1287/mnsc.2015.2181
Jiang, C., Wang, Z., Wang, R., Ding, Y., 2018. Loan default prediction by combining soft information extracted from descriptive text in online peer-to-peer lending. Ann. Oper. Res. 266, 511–529. https://doi.org/10.1007/s10479-017-2668-z
Jiang, C., Wang, Z., Zhao, H., 2019. A prediction-driven mixture cure model and its application in credit scoring. Eur. J. Oper. Res. 277, 20–31. https://doi.org/10.1016/j.ejor.2019.01.072
Karlan, D.S., 2007. Social connections and group banking. Econ. J. 117. https://doi.org/10.1111/j.1468-0297.2007.02015.x
Kou, F., Du, J., Lin, Z., Liang, M., Li, H., Shi, L., Yang, C., 2018. A semantic modeling method for social network short text based on spatial and temporal characteristics. J. Comput. Sci. 28, 281–293. https://doi.org/10.1016/j.jocs.2017.10.012
Lee, T.S., Chiu, C.C., Chou, Y.C., Lu, C.J., 2006. Mining the customer credit using classification and regression tree and multivariate adaptive regression splines. Comput. Stat. Data Anal. 50, 1113–1130. https://doi.org/10.1016/j.csda.2004.11.006
Lee, T.S., Chiu, C.C., Lu, C.J., Chen, I.F., 2002. Credit scoring using the hybrid neural discriminant technique. Expert Syst. Appl. 23, 245–254. https://doi.org/10.1016/S0957-4174(02)00044-1
Lessmann, S., Baesens, B., Seow, H.V., Thomas, L.C., 2015. Benchmarking state-of-the-art classification algorithms for credit scoring: An update of research. Eur. J. Oper. Res. 247, 124–136. https://doi.org/10.1016/j.ejor.2015.05.030
Lin, M., Prabhala, N.R., Viswanathan, S., 2013. Judging borrowers by the company they keep: Friendship networks and information asymmetry in online peer-to-peer lending. Manage. Sci. 59, 17–35. https://doi.org/10.1287/mnsc.1120.1560
Malhotra, R., Malhotra, D.K., 2003. Evaluating consumer loans using neural networks. Omega 31, 83–96. https://doi.org/10.1016/S0305-0483(03)00016-1
Martens, M.L., Jennings, J.E., Jennings, P.D., 2007. Do the stories they tell get them the money they need? The role of entrepreneurial narratives in resource acquisition. Acad. Manag. J. 50, 1107–1132. https://doi.org/10.5465/AMJ.2007.27169488
Michels, J., 2012. Do unverifiable disclosures matter? Evidence from peer-to-peer lending. Account. Rev. 87, 1385–1413. https://doi.org/10.2308/accr-50159
Mukherjee, A., Nath, P., 2003. A model of trust in online relationship banking. Int. J. Bank Mark. 21, 5–15. https://doi.org/10.1108/02652320310457767
Peng, H.F., Ye, Y.G., 2011. Loan-pricing model based on repayment ability and repayment willingness. Chinese J. Manag. Sci. 19, 40–47.
Perkins, S.J., Hendry, C., 2005. Ordering top pay: Interpreting the signals. J. Manag. Stud. 42, 1443–1468. https://doi.org/10.1111/j.1467-6486.2005.00550.x
Pötzsch, S., Böhme, R., 2010. The role of soft information in trust building: Evidence from online social lending. In: Lecture Notes in Computer Science, pp. 381–395. https://doi.org/10.1007/978-3-642-13869-0_28
Rao, D., Zhu, Y., Jiang, Z., Zhao, G., 2015. Generating rules with common knowledge: A framework for sentence information extraction. In: Proceedings of the 2015 7th International Conference on Intelligent Human-Machine Systems and Cybernetics (IHMSC 2015), pp. 373–376. https://doi.org/10.1109/IHMSC.2015.113
Sowers, D.C., Durand, D., 1942. Risk elements in consumer installment financing. J. Mark. 6, 407. https://doi.org/10.2307/1246534
Spence, M., 1973. Job market signaling. Q. J. Econ. 87, 355–374. https://doi.org/10.2307/1882010
Standifird, S.S., 2001. Reputation and e-commerce: eBay auctions and the asymmetrical impact of positive and negative ratings. J. Manage. 27, 279–295. https://doi.org/10.1016/S0149-2063(01)00092-7
Sun, J., Li, H., 2008. Listed companies' financial distress prediction based on weighted majority voting combination of multiple classifiers. Expert Syst. Appl. 35, 818–827. https://doi.org/10.1016/j.eswa.2007.07.045
Van den Bogaerd, M., Aerts, W., 2015. Does media reputation affect properties of accounts payable? Eur. Manag. J. 33, 19–29. https://doi.org/10.1016/j.emj.2014.05.002
Wang, H., Lin, H., 2015. An empirical study of borrowing description's influence on P2P lending. J. Financ. Econ. 30, 77–85.
Wang, J.-Y., Kuo, M.-F., Han, J.-C., Shih, C.-C., Chen, C.-H., Lee, P.-C., Tsai, R.T.-H., 2017. A telecom-domain online customer service assistant based on question answering with word embedding and intent classification. In: Proceedings of the IJCNLP 2017, System Demonstrations, pp. 17–20.
Wang, W.M., Cheung, C.F., Lee, W.B., Kwok, S.K., 2008. Mining knowledge from natural language texts using fuzzy associated concept mapping. Inf. Process. Manag. 44, 1707–1719. https://doi.org/10.1016/j.ipm.2008.05.002
Wang, Y., Xu, W., 2018. Leveraging deep learning with LDA-based text analytics to detect automobile insurance fraud. Decis. Support Syst. 105, 87–95. https://doi.org/10.1016/j.dss.2017.11.001
West, D., 2000. Neural network credit scoring models. Comput. Oper. Res. 27, 1131–1152. https://doi.org/10.1016/S0305-0548(99)00149-5
Wicaksono, A.F., Myaeng, S.H., 2013. Toward advice mining: Conditional random fields for extracting advice-revealing text units. In: International Conference on Information and Knowledge Management, Proceedings, pp. 2039–2048. https://doi.org/10.1145/2505515.2505520
Wiginton, J.C., 1980. A note on the comparison of logit and discriminant models of consumer credit behavior. J. Financ. Quant. Anal. 15, 757–770. https://doi.org/10.2307/2330408
Yan, X., Guo, J., Lan, Y., Cheng, X., 2013. A biterm topic model for short texts. In: WWW 2013 - Proceedings of the 22nd International Conference on World Wide Web, pp. 1445–1455. https://doi.org/10.1145/2488388.2488514
Young, T., Hazarika, D., Poria, S., Cambria, E., 2018. Recent trends in deep learning based natural language processing [Review article]. IEEE Comput. Intell. Mag. https://doi.org/10.1109/MCI.2018.2840738

Highlights

1. This paper examines whether the semantic information of loan description text can help to predict the credit risk of different types of borrowers on a Chinese P2P platform.
2. We use the 5P credit evaluation theory and the word embedding model to extract semantic features of loan description text from five different dimensions.
3. The semantic features of textual soft information can significantly improve the predictability of credit evaluation models, and the improvement is most significant for first-time borrowers.

The authors have declared that no conflict of interest exists.

Author Contribution Statement

Kun Liang: Conceptualization, Methodology, Investigation, Writing - Original Draft, Writing - Review & Editing, Supervision, Project administration, Funding acquisition. Jun He: Software, Validation, Formal analysis, Resources, Data Curation, Visualization.