Behaviour-based short-term invoice probability of default evaluation


ARTICLE IN PRESS

JID: EOR

[m5G;September 5, 2016;13:24]

European Journal of Operational Research 0 0 0 (2016) 1–10

Contents lists available at ScienceDirect

European Journal of Operational Research journal homepage: www.elsevier.com/locate/ejor

Interfaces with Other Disciplines

Behaviour-based short-term invoice probability of default evaluation

Igor Perko

Faculty of Economics and Business, University of Maribor, Razlagova 14, 2000 Maribor, Slovenia

Article info

Article history: Received 24 March 2016; Accepted 16 August 2016; Available online xxx

Keywords: Probability of default; Data sharing; Behavioural analytics; Credit risk; Strategic default

Abstract

In this paper, the effect of behavioural analytics on short-term default predictions at the invoice level is addressed by answering a question that slightly diverges from the traditional probability of default definition: ‘What is the probability that this invoice will be paid within the next 30 days?’ Answering this question improves the accuracy of short-term liquidity planning and supports financial management in companies. To provide a valid answer to the research question, a set of issues needs to be resolved, including identifying an appropriate data set, increasing the predictive power of the data, and creating and testing predictive models. Since an appropriate data set has not yet been presented, we primarily focus on the first two issues: identifying appropriate data and raising its predictive power. We propose to build predictive models upon a new data source from multiple companies, acquired through the business partners’ data sharing concept. Furthermore, we upgrade these data with behavioural analysis to test the assumption that the probability of default depends not only on payment capability but also on payment preparedness. The predictive power of shared invoice data and the effects of behavioural analysis are tested in a two-phase experiment: in the first phase, basic shared data are used to predict short-term invoice defaults, and in the second phase, the behavioural analysis results are included in the data set. Lastly, the predictive models’ test results are compared. Both results are positive: the already high accuracy of models built upon basic data is significantly improved in models using the data set extended with behavioural analysis. © 2016 Elsevier B.V. All rights reserved.

1. Introduction

The commonly used metric for determining partners’ financial state, the probability of default (PD), is focused on assessing a company’s long-term ability to comply with financial obligations (Allen, 1981; Blochlinger, 2012; Doumpos, Kosmidou, Baourakis, & Zopounidis, 2002; Kim & Sohn, 2010). PD and related metrics are used by multiple stakeholders: financial institutions in the role of creditors and investors, owners, and business partners. Most of them are involved in long-term business relationships and use PD as a long-term variable. The probability of default at the invoice level is consequently derived from the probability of default at the company level, and research focusing on invoice-level PD prediction is lacking. Even though diverse approaches have been taken to improve the accuracy of the predictions (Hu & Ansell, 2007; Kim & Sohn, 2010; Premachandra, Bhabra, & Sueyoshi, 2009), long-term PD accuracy is relatively low, usually around 80%, mostly due to the long prediction horizon and to the fact that it is evaluated on highly aggregated data, revealing only company-level results.


In the day-to-day business environment, short-term payment predictions are of existential importance (Leow & Crook, 2014). The company-level PD is insufficient to support operational-level management decisions that involve selecting the best steps for risk mitigation and payment optimisation. The same is true for tactical-level liquidity planning, where the probability of payments must be assessed on an invoice level. The related scientific literature does not provide adequate research results in this field. We contribute to OR research by proposing a new combination of data analysis processes, and to the financial management literature by providing a model for short-term payment predictions. If the focus shifts from observing long-term company survivability to predicting short-term business events, we should rephrase the probability of default question from ‘Will the company survive in the long term?’ to ‘Will the invoice be paid within a time limit?’ This question requires a much higher precision level than an overall company PD can deliver, since it focuses on the ability and preparedness of the company to pay one specific item in the financial traffic. The invoice-level PD enables better selection of operational-level risk mitigation strategies than a company-level PD. It also provides a significant improvement in financial management decision-making support.

http://dx.doi.org/10.1016/j.ejor.2016.08.039 0377-2217/© 2016 Elsevier B.V. All rights reserved.

Please cite this article as: I. Perko, Behaviour-based short-term invoice probability of default evaluation, European Journal of Operational Research (2016), http://dx.doi.org/10.1016/j.ejor.2016.08.039


In the proposed solution, an innovative combination of OR approaches is applied, with special attention to data source discovery and data upgrading methodologies. Sharing invoice payment data provides insight into high-granularity operational data; behavioural pattern analysis provides additional knowledge about commonly used strategies; and state-of-the-art prediction algorithms provide accurate results on a more detailed level than the state-of-the-art PD-related research.

The first issue that requires a solution is the acquisition of a valid data source. Currently, two major data sources are used to assess PD: the first is the data reported by the evaluated companies, and the second is personal experience, obtained by doing business with the company. Both have limitations.

1. Financial reports are highly synthesised and potentially biased: we can assume that a company’s management will try to exaggerate its own successes and underplay or even hide failures (Perko & Mlinaric, 2016; Wang & Zhang, 2009). Additionally, these data are usually synthesised on a company level and therefore lack the granularity required for assessing PD on the invoice level.
2. Personal experience usually represents only a small fragment of the company’s overall business relationships; therefore, predictions regarding future behaviour are limited to this narrow scope of information. We can only see how our partners are behaving towards us, while for correct behaviour assessment, a full picture is required.

In this paper, the sharing of business partner invoice payment data is explored as a valid data source for invoice PD prediction. Sharing business data with customers and partners in supply chains is common practice; examples include inventory listings and operational and logistic data. Customer data are also shared between financial institutions (Jappelli & Pagano, 2002).
We expect that invoice payment-related data gathered from multiple sources, at the appropriate granularity level, can provide the required holistic picture of a company’s performance and reduce the risk of biased reporting to an acceptable level.

In short-term invoice PD, particular attention is devoted to companies already in situations of stress, where their preparedness to pay an invoice must also be taken into account. We expect that financial officers of companies in distress generally develop similar strategies about which invoices to pay regularly and which not to pay. When behavioural patterns of multiple companies are observed, these patterns can be identified and used to predict the forthcoming behaviour of other financial officers in similar circumstances.

Behavioural analysis is used to enhance the data reasoning capabilities. First, we identify common behavioural patterns and rules for assessing affinity towards them on multiple levels; then we assess the company’s affinity towards these patterns. Last, for every invoice, we enhance the basic invoice data with the appropriate affinity data.

To test the proposed solution, an experiment is executed based on invoice data shared by large Slovenian companies. Short-term PD is calculated using multiple data mining algorithms in two phases. First, only basic shared data are used, and second, behaviour pattern affinity data are added. The created models are then tested and the results compared. A high prediction accuracy, due to the short prediction span and the fine granularity of the data, is expected in the first phase, and an accuracy improvement due to better explanation is expected in the second phase of the experiment.

The paper is organised as follows: first, the literature background is elaborated, and then the proposed invoice PD data model is presented, focusing on sharing mechanisms and the interpretation of behaviour strategies. The model test results are presented and thoroughly analysed. Finally, implications for financial management and OR are discussed, and potential future research projects are proposed.

2. Background

The probability of default is among the most important forms of information in financial institutions’ credit management processes. In a formal definition (Basel, 2006, p. 96), default occurs when the obligor is unlikely to pay its credit obligations or when the obligor is past due more than 90 days on any material credit obligation. PD is the fundamental measure in the regulatory frameworks for financial institutions (BIS, 2011) and has therefore been heavily researched (Crouhy, Galai, & Mark, 2000; Doumpos et al., 2002; Li & Miu, 2010; Tong, Mues, & Thomas, 2012). Its shortcomings are also strongly disputed. Finlay (2009), for instance, addresses the issue of incorrect problem specification, while the lack of time definition in PD is addressed by invoking time-varying covariates (Orth, 2013) and past-due time thresholds (Harris, 2013). A valid solution systematically resolving the PD time indistinctness has not been found in the literature.

Research efforts supporting the needs of corporate financial officers are not well documented compared with research related to financial institutions. Barro and Basso (2010) define counterparty risk and determine the impact of important player defaults on the market. The scarcity of corporate-focused research could be the consequence of special needs, focusing on short-term liquidity, or of limited data sets, prohibiting successful data mining. With this paper, we intend to help close this gap.

The predictive assessment quality depends on the research question definition, predictive algorithm selection, the data quality, and the process management.
The PD-related operational research is largely focused on upgrading predictive models and algorithms using variations and combinations of support vector machines (SVM), neural networks, genetic algorithms, logistic regression, and other methods, including hybrid solutions (Harris, 2013; Kim & Sohn, 2010; Yao, Crook, & Andreeva, 2015). Nonetheless, too little attention is paid to understanding the drivers of a default, achieving the required predictive power of the data, and formulating a valid research question to address specific PD-related issues.

Is the probability of default the correct predictive research question for assessing short-term debts? Finlay (2010) discusses the question by determining the objectives in the credit relationship. Dionne and Laajimi (2012) imply default thresholds and elaborate liquidity shortage and strategic default. Ju, Jeon, and Sohn (2015) research the effects of behavioural technology in assessing the effects of the economic environment in stress situations. Nevertheless, to our knowledge, a solution capable of resolving the need for short-term default assessment has not yet been provided.

Providing the data quality level required for successful predictive analysis is the most time-consuming part of the predictive analytic process (Davidson & Tayi, 2009), yet it is only sparsely documented in the scientific literature. The main focus of data quality-related research is on resolving data issues such as completeness, validity, consistency, timeliness, and accuracy (Zhang, Zhang, & Yang, 2003). In this paper, we extend the data quality formulation by examining the importance of data relevance and data explanatory power for the accuracy of the predictive models; in short, the predictive power of the data. Data quality has a significant effect on prediction accuracy (Coculescu, Geman, & Jeanblanc, 2008; Florez-Lopez, 2010; Wolter & Roesch, 2014).
Piramuthu (2006) points out the significance of data understanding, correct interpretation, and pre-processing effects on prediction quality. After examining the PD-related literature, we can conclude that most of the models use company-level data that are provided systematically, and that data of finer granularity (for instance, business event-level data) are less accessible to researchers.

Fig. 1. Company overview (Perko & Mlinaric, 2016).

Payment utilisation, ranging from regular payments, late payments, and strategic defaults to company defaults, represents an important issue in corporate finance. Uzzi and Gillespie (2002) discuss late payments, relations, and penalties, while Earle and Sabirianova (2002) examine the business environment effect. The effects of strategic actions, especially strategic defaults, on corporate debt values are elaborated by Davydenko and Strebulaev (2007). A model explicitly predicting late payments or payment default is not found in the literature. We intend to address this gap by using shared invoice payment data enhanced with behavioural analysis to create and test invoice PD prediction models.

The lack of quality data is one of the fundamental issues in designing successful prediction models. One of the promising concepts is transfer learning. Pan and Yang (2010) elaborate the state of the art and the potential of transferring the knowledge and patterns created in a well-researched domain to another (potentially related) domain where the data do not support the creation of accurate models. The concept of transfer learning is also proposed by Zhu, He, and Jiang (2015) to predict customers’ behaviours.

Pagano and Jappelli (1993, 2002) research the effects of data sharing on credit risk mitigation among financial institutions. The data sharing concept between companies, used as a data source in this paper, is elaborated by Perko and Mlinaric (2016). They propose a concept where companies share business data describing their partners’ behaviour. Three major roles in the sharing process are identified: the sharing companies, the agency, and the observed companies on which the data are shared.
Their roles, tasks, and involved processes are elaborated with the aim of providing the optimal utility for all of the involved parties. Perko, Primec, and Horvat (2015) further discuss business partners’ data sharing from multiple viewpoints: feasibility issues, business value added and risks, and ethical and legal aspects. Using system dynamics, they conclude that business data sharing can have a short-term negative impact on some of the companies, but that, with the transparency it brings to the business ecosystem, the viability potential of most of its participants is increased.

The result of the sharing process provided by Perko and Mlinaric (2016) is an overview of the current state of and recent changes in the observed company’s open debts, including the data visualisation, as depicted in Fig. 1. That research did not deliver predictions of the companies’ future behaviour. We try to accomplish this task in the present paper.

Behavioural analysis has a significant effect on the short-term PD assessment accuracy, as discussed by Kennedy, Mac Namee, Delany, O’Sullivan, and Watson (2013). Cao (2010) proposes to introduce behavioural informatics, including behaviour representation, behavioural data construction, behaviour impact analysis, behaviour pattern analysis, behaviour simulation, behaviour presentation, and behaviour use. Besser (1999) analyses behaviour in small and medium-sized enterprises (SMEs), proving that they are oriented to support their local environment and arguing the claim with the closed support loop. Finally, Ju et al. (2015) invoke management behaviour in stress situations as a credit risk factor. Nonetheless, analysis of companies’ payment-related behaviours and their effects on PD is not presented in the literature (Fig. 2).

To elaborate decisions that have negative social or environmental consequences, Wood, Noseworthy, and Colwell (2013) suggest that the degree to which managers make high-risk trade-offs is highly influenced by how they mentally represent the decision context. The authors find that managers are more likely to make seemingly unethical trade-offs when psychological distance is high (rather than low) and when they are forced to choose between competing alternatives. Chen, Koek, and Tong (2013) examine the effects of inventory quantities on the selection of payment schemes and discover that behaviour is not oriented towards profit maximisation, but can be better explained by prospective accounting theory.
Kautonen, van Gelderen, and Tornikoski (2013) propose applying the theory of planned behaviour to predict entrepreneurial behaviour. They discuss attitude, perceived behavioural control, and subjective norms as significant predictors of entrepreneurial intention, and perceived behavioural control as a significant predictor of subsequent behaviour. Nevertheless, since the previous research did not provide a valid behavioural data source, we suggest utilising business data sharing for that purpose in the scope of predicting PD.

Fig. 2. The behaviour-based prediction process.





3. Methodology

In the following sections, we evaluate the short-term probability of default of an invoice using historical payment data, which are additionally described with the behavioural analysis results. We thereby formulate the research question: ‘What is the probability that this invoice is going to be paid within the next 30 days?’ By answering this question, we provide support for financial management at the operational level, helping to decide whether to accept or reject a business proposal, to select appropriate pre-emptive collateral instruments, to apply appropriate credit risk mitigation strategies, and to plan liquidity on the operational and tactical levels. We invoke the generally accepted PD and further elaborate some of its properties:



• Scope. Generally, PD is a company-level composite measure, including parameters gathered on multiple levels (Crouhy et al., 2000). In the present research, the focus is on predicting an elementary event: the (non)payment of a single invoice.
• Time frame. The general PD time frame is not specifically defined, but since the evaluation involves slowly changing data, the evaluation intervals can be up to one year (Altman & Rijken, 2004). It is applied to monitor long-term financial investments, and we argue that it is a long-term variable. In contrast, invoice PD predicts whether the invoice payment will (or will not) occur within 30 days, and the data upon which it is evaluated are expected to change in near-real time, placing new requirements on the predictive model management process.
• Usability. Company-level PD is primarily applied to assess credit risks in financial instruments and derivatives. Because of its scope and time frame, a single-invoice PD can be used in operational cash flow management, liquidity planning, and credit risk management. When summarised, invoice PD can provide usage potential similar to that of company-level PD. Indirectly, it can contribute to better payment behaviour of non-yielding companies.

To resolve the research task, several stages of analysis are addressed:

• Acquiring a relevant data source. To predict future conduct, the most influential source is previous behaviour in similar positions. To predict invoice PD, therefore, previous payment behaviour data are of major importance. A company is unlikely to report its disreputable past behaviour correctly; therefore, other data sources must be obtained.
• Identifying, defining, and measuring behavioural patterns. It is much easier to predict future behaviour if a limited number of behaviour patterns is identified and if we can reason about when a company’s financial officer uses these patterns. We propose a three-step behaviour pattern evaluation process.
• Predicting behaviour in an instance. Based on existing behavioural patterns, predictive models must be created that can be applied to the data describing a new payment instance.

To resolve the research issues, the following research steps are taken: a valid research data set describing payment history, obtained with the sharing mechanism, is used, and a behavioural analysis model is established. In an experiment, the shared payment data and the behavioural data are used to create, test, and compare several predictive models, thereby assessing their usability.

3.1. Data sources

To acquire a relevant data source, we must satisfy multiple objectives: key data must be at least on the same level of detail as the prediction, and the data must be unfiltered and with bias reduced to an acceptable level (Pagano & Jappelli, 1993; Perko & Mlinaric, 2016). To meet the first objective, a detailed data set on payment history is required. Self-reported payment data do not fulfil the second objective, since they are predisposed to distortion and filtering. Nevertheless, multiple options for acquiring unbiased data are available: payment data stored in financial organisations, data stored in Enterprise Resource Planning (ERP) cloud solutions, and data gathered by a payment data sharing mechanism (Perko & Mlinaric, 2016). The latter is briefly explained in this section.

The payment data sharing mechanism is based on businesses sharing, on a daily basis, data on the invoices they have issued that remain unpaid, from the date an invoice was issued to the day it was compensated; the date of compensation can thereby be inferred. Companies share data on unpaid invoices they have issued to their business partners. By doing this, data sharing companies reveal their partners’ performance and at the same time do not expose data on their own performance. In return, the information originating from multiple sources, which can be used in making business decisions, is delivered to the data sharing company. The mechanism is based on the presumption that the value of complete information on business partners is higher than the cost of reporting the data (ibid.).
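The inference of the compensation date from daily snapshots can be illustrated with a minimal sketch. It assumes, purely for illustration, that each daily report is a set of invoice IDs still unpaid on that day; the function name `infer_payment_dates` and the invoice IDs are hypothetical and do not come from the sharing mechanism itself. An invoice is treated as compensated on the first day it no longer appears in the report.

```python
from datetime import date, timedelta


def infer_payment_dates(snapshots):
    """Infer when invoices were compensated from daily unpaid-invoice reports.

    snapshots: dict mapping a snapshot date to the set of invoice IDs that
    were reported as unpaid on that date (an assumed, simplified format).
    Returns a dict invoice_id -> first snapshot date on which the invoice
    was no longer reported as unpaid.
    """
    days = sorted(snapshots)
    inferred = {}
    for previous, current in zip(days, days[1:]):
        # Invoices present yesterday but missing today are assumed paid today.
        for invoice_id in snapshots[previous] - snapshots[current]:
            inferred.setdefault(invoice_id, current)
    return inferred


d0 = date(2013, 1, 1)
snapshots = {
    d0: {"INV-1", "INV-2"},
    d0 + timedelta(days=1): {"INV-1", "INV-2", "INV-3"},
    d0 + timedelta(days=2): {"INV-2", "INV-3"},
    d0 + timedelta(days=3): {"INV-3"},
}
paid = infer_payment_dates(snapshots)
for invoice_id in sorted(paid):
    print(invoice_id, paid[invoice_id].isoformat())
```

The sketch ignores partial payments and reporting gaps; a missed daily report would shift the inferred date, which is one reason regular sharing matters.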
In the payment data sharing mechanism, only key invoice data are shared: the creditor’s (the sharing company’s) ID, the debtor’s (the partner’s) ID, the invoice ID, the invoice date, the due date, and the claimed amount. These data are usually already stored in enterprise resource planning (ERP) systems, minimising the operational sharing costs. If the data are shared regularly, the information is sufficiently detailed to reveal the partners’ payment behaviour patterns, but does not represent a disclosure risk for the sharing company.

In the payment data sharing mechanism, three actors are involved: the data sharing company, the agency, and the business partners. The sharing company’s goal is to reduce partner-related risks, so it gains value added by acquiring a more complete picture of business partners’ payment behaviour. The data sharing agency’s task is to process and share payment behaviour data. The business partners are passive, with their reputation dependent on agency reports. Thus, their interests must be protected by informing them of data reported on them and providing them with an opportunity to respond to the reports.

The agency acts in the name of the sharing companies. Its aim is to supply services that are beyond the reach of a single sharing company, to provide information with the highest value added, and to minimise the data sharing-related costs and hazards by using appropriate security measures. The major agency-related risk is connected with the potential misuse of shared information. Therefore, appropriate supervision mechanisms must be applied at all levels.

Perko and Mlinaric (2016) applied the sharing model in a real-world experiment. Conducted in 2012 and 2013, the experiment includes data gathered on a daily basis. The last data snapshot contains invoice data on 4275 companies, with 33,635 invoices, for a total value of 58,512,870.49 euros. In the experiment, they addressed two data quality issues: the level of detail observed should be at least at the level of the issue analysed, and the data should be gathered from multiple independent sources. They also provided support for their second hypothesis, that complete data are not required for identifying potential issues, but that these can be derived from a relatively small sample. The resulting data set, upgraded with data describing behaviour patterns, is used in the invoice probability of default evaluation experiment elaborated in this paper.

3.2. Behaviour analysis

By introducing behaviour pattern analysis to the PD prediction, we test the assumption that prediction results will be significantly improved over direct prediction.
Behavioural patterns limit the number of potential behavioural options on the one hand, while on the other providing a valid mechanism for reasoning about the affinity towards selected behaviour patterns and its impact on the PD prediction. In other words, we argue that if we can explain why somebody acts in a certain way and describe the typical consequences of such actions in the past, we can better anticipate future behaviour and validate its potential outcomes. We propose a three-step behavioural analysis. In the first step, the behavioural pattern framework is designed. This involves the identification of patterns and the definition of validation rules for evaluating affinity towards every specific pattern on multiple levels. It results in populating two structures:

\[ P_i = [p]_{i,l} \tag{1} \]

\[ A_{i,l} = f(R_{1..n})_{i,l} \tag{2} \]

where P in (1) is a finite set (i: 1..n) of predefined patterns (p) on four basic levels (l). Levels define to whom a pattern applies: to all companies, to a specific group of companies (for instance, a branch or local group), to a single company, or to a single invoice. The affinity (A) in (2) is a function based on rule sets (R) that are identified for every pattern on every level (where applicable), and results in the affinity towards using a certain pattern in a given situation. Since the rules for identifying affinity differ significantly between patterns, the affinity evaluation functions are manually coded. For instance, the affinity towards the pattern ‘Non-payment of old debts’ is tested by using a rule set containing one single metric, ‘Executed old payments ratio’, which compares the paid and unpaid amounts in invoices that should have been paid at least 90 days ago. If all the old debts are paid, it returns 0; if all are unpaid, it returns 1. This rule set is evaluated on the following levels: all (all companies to all companies, debtor group to all companies, debtor to all companies), group (debtor group to creditor group, debtor group to creditor, debtor to creditor group), and company level (debtor to creditor), for all of the available combinations in the stored data. Thus, if a company does not pay any old debts to a certain company, the company-to-company affinity is 1. If a company pays 50% of its old debts to a certain branch, the company-to-branch affinity is 0.5.

In the second step, the affinity towards behavioural patterns is repeatedly evaluated using the last known data. The data expressing behavioural patterns (BP) in (3) are stored in a four-dimensional cube.

\[ BP = [a, \partial a]_{i,j,l,k} \tag{3} \]

The cube dimensions are i: pattern, j: company, k: counterparty, and l: level. In each cell, two normalised values are stored: a: affinity towards selected BP and ∂ a: change in the last recorded interval. Both absolute value and change have potential value added for PD prediction accuracy (Cao, 2010; Webb & Sheeran, 2006). The need for storing historical values in the affinity cube is substituted by introducing the ∂ a, presuming that the last changes in the affinity towards BP have the largest influence on future behaviour.
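A minimal sketch of the ‘Non-payment of old debts’ rule and of the affinity cube may help to make the two steps concrete. The exact formula of the ‘Executed old payments ratio’ is not given in the text, so the sketch assumes it is the unpaid share of the amount that fell due at least 90 days ago; the field names, the level labels, the `"*"` marker for ‘all companies’, and the dictionary used in place of a cube are all illustrative assumptions.

```python
from datetime import date

PATTERN = "Non-payment of old debts"


def old_debt_affinity(invoices, today, min_days_overdue=90):
    """Assumed form of the rule: unpaid share of the amount due >= 90 days ago.

    Returns 0.0 if all old debts are paid, 1.0 if all are unpaid,
    and None when there are no old debts to evaluate.
    """
    old = [inv for inv in invoices if (today - inv["due"]).days >= min_days_overdue]
    total = sum(inv["amount"] for inv in old)
    if total == 0:
        return None
    unpaid = sum(inv["amount"] for inv in old if not inv["paid"])
    return unpaid / total


def update_cube(cube, key, affinity):
    """Store (a, delta a) under key = (pattern, company, counterparty, level)."""
    previous = cube.get(key, (affinity, 0.0))[0]  # first evaluation: no change
    cube[key] = (affinity, affinity - previous)


invoices = [
    {"debtor": "D1", "creditor": "C1", "amount": 100.0, "due": date(2013, 1, 15), "paid": False},
    {"debtor": "D1", "creditor": "C1", "amount": 100.0, "due": date(2013, 1, 15), "paid": True},
    {"debtor": "D1", "creditor": "C2", "amount": 200.0, "due": date(2013, 2, 1), "paid": False},
    {"debtor": "D1", "creditor": "C2", "amount": 500.0, "due": date(2013, 6, 20), "paid": False},
]
today = date(2013, 6, 30)
cube = {}

# Company level (debtor to creditor) and 'all' level (debtor to all companies).
to_c1 = [inv for inv in invoices if inv["creditor"] == "C1"]
update_cube(cube, (PATTERN, "D1", "C1", "company"), old_debt_affinity(to_c1, today))
update_cube(cube, (PATTERN, "D1", "*", "all"), old_debt_affinity(invoices, today))

# A later evaluation: D1 has settled its remaining old debt to C1.
invoices[0]["paid"] = True
update_cube(cube, (PATTERN, "D1", "C1", "company"), old_debt_affinity(to_c1, today))
```

In this toy data, the recent 500.0 invoice is ignored because it is not yet 90 days overdue, and the second evaluation records both the new affinity and its change, so no history needs to be kept.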

\[ BP_n(i,j) = [a, \partial a]_{l,k}(i,j) \tag{4} \]

\[ \mathit{invoice}_n \Rightarrow \mathit{invoice}_n ;\ BP_n(i,j) \tag{5} \]

In the third step, for every invoice (n), the appropriate array of affinities is selected based on the company (i) and the counterparty (j) of the invoice, as defined in (4). Then, the existing invoice data are enriched with this array of affinities and prepared for the prediction process (5). For instance, if a company’s affinity towards not paying old debts to all companies is at 90% and decreasing, and its affinity towards the company in this invoice is at 50% and stable, this is added to the invoice information. We expect that by providing additional BP affinity information, the relations to other (non)payments are more easily elaborated and predictions on (non)payments can be provided with higher accuracy.

The result of the behaviour analysis is formalised knowledge on debtor behaviour in diverse situations in the form of an array of variables. These variables can simply be added to the basic data of a single invoice and can thereby be easily used in the predictive analytics process. In the case presented in this paper, the selection of behavioural patterns and the design of the rules validating affinity are based upon existing user experience and data-related limitations, and are coded manually. This approach should be upgraded using dynamic intelligent pattern recognition and affinity assessment methods, especially during the system optimisation process.

3.3. Predictive analysis

Predictive analytic models are created and tested on a data subset covering a limited time interval. The data subset includes invoices that were unpaid at the beginning of the interval and were either compensated or not during this interval. The dependent variable Paid has two values: 0 and 1. Invoices that were compensated during this period are marked with 1, and invoices that remained unpaid are marked with 0. Independent variables contain basic invoice data and data describing behavioural properties.
Reformulation of the complex behavioural analysis to a single row with independent variables and one dependent variable enables us to use standard classification algorithms to predict and test the indented variable Paid.
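The enrichment step (5) and the single-row formulation can be sketched as follows. This is an illustration only: the function `enrich_invoice`, the field names, and the all-zero default for company pairs without recorded behaviour are assumptions, not part of the published method. The example values mirror the 90%-and-decreasing and 50%-and-stable case mentioned in the text.

```python
from typing import Dict, List, Tuple

# Hypothetical lookup: (company, counterparty) -> flat array of [a, delta_a] values.
Affinities = Dict[Tuple[str, str], List[float]]


def enrich_invoice(invoice: dict, affinities: Affinities, width: int) -> List[float]:
    """Return one model row: basic attributes, behavioural array, then Paid."""
    key = (invoice["company"], invoice["counterparty"])
    # Pairs without recorded behaviour get a neutral all-zero array (assumption).
    bp = affinities.get(key, [0.0] * width)
    basic = [invoice["amount"], invoice["days_not_paid"]]
    # Dependent variable: 1 if compensated during the interval, otherwise 0.
    paid = 1 if invoice["paid_in_interval"] else 0
    return basic + list(bp) + [paid]


# General affinity 0.9 and decreasing; pair-specific affinity 0.5 and stable.
affinities: Affinities = {("C1", "D7"): [0.9, -0.1, 0.5, 0.0]}
invoice = {"company": "C1", "counterparty": "D7", "amount": 1200.0,
           "days_not_paid": 14, "paid_in_interval": True}
row = enrich_invoice(invoice, affinities, width=4)
print(row)  # [1200.0, 14, 0.9, -0.1, 0.5, 0.0, 1]
```

Because every invoice becomes one flat row ending in Paid, any standard classifier can consume the result without knowing how the behavioural values were derived.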

3.4. The experiment

To test the effects of behavioural data on the prediction accuracy, an experiment is conducted in which the probability of default is predicted twice: the first (control) set of predictive models is built on basic invoice data; the second set of predictive models uses enhanced invoice data, described with additional attributes explaining BP affinity.

In the experiment, data on 71,554 invoices are included, gathered in a time span of over one year. The experiment begins with behavioural analysis. Behavioural variables are calculated for all the creditor-debtor combinations on multiple levels. This results in 87,201 single behavioural variables. In the next step, invoices from the last recorded 30-day time interval are selected for building and testing the prediction models. A total of 22,919 invoices are selected: 15,520 of these were paid in the last 30 days, and 7339 remained unpaid. In the experiment, two data sets are used to predict PD:

• Basic data set, used as the control set. In the control data set, the basic invoice data properties are stored: creditor’s international tax ID, due date, the claimed amount, DaysNotPaid and the dependent variable Paid.
• Enhanced, behaviour-enriched data set: a copy of the basic data set, described with 46 additional behavioural attributes, ranging from ‘Incorrectly issued invoice’ to ‘Debtor is in default’, observing properties at the group, debtor, creditor, and invoice levels. Some of these patterns are measured on only one level, while others, like ‘Non-payment of old debts’, are measured at multiple levels.
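Both data sets are evaluated with the same protocol: ten repetitions of a random split into 60% of invoices for model building and 40% for testing. A minimal sketch of such a split, using only the Python standard library, is given here. The function name `repeated_random_split` and the fixed seed are assumptions for illustration; the experiment itself was run in the Orange tool, whose internal sampling may differ.

```python
import random
from typing import Iterator, List, Tuple


def repeated_random_split(n_rows: int, train_fraction: float = 0.6,
                          repeats: int = 10, seed: int = 0
                          ) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) once per repetition."""
    rng = random.Random(seed)  # fixed seed so the splits are reproducible
    n_train = int(round(n_rows * train_fraction))
    for _ in range(repeats):
        indices = list(range(n_rows))
        rng.shuffle(indices)
        # The first part builds the model; the remainder is held out for testing.
        yield indices[:n_train], indices[n_train:]


splits = list(repeated_random_split(n_rows=50))
for train, test in splits:
    # Every row is used exactly once per repetition, in either train or test.
    assert sorted(train + test) == list(range(50))
```

Since each repetition contributes its own 40% test sample, pooling the test predictions over ten repetitions gives a test set that is several times larger than a single hold-out sample.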

Predictive models are built (and tested) twice: the first (control) set of predictive models is built on basic invoice data; the second set of predictive models uses enhanced invoice data. For building and testing the models, a repeated random sampling method is used 10 times, with 60% of invoices randomly sampled for model building and the remaining 40% used for testing. The exception is the SVM model, where, due to issues with the prediction modelling tool, a single repetition is executed. Multiple data-mining classification algorithms are used to create predictive models, including neural networks (NN), support vector machines (SVM), classification tree (CT), logistic regression (LR), and CN2 (Clark & Niblett, 1989). The same predictive algorithms are used in the first (control) set and in the second set. The Orange (Demsar et al., 2013) data mining tool is used to perform the predictive analysis processes. The invoice PD prediction models are tested using standard testing procedures, testing variables, and randomly selected data. Models’ test results from the first (control) set are compared with test results based on enhanced data.

4. Results

In the experiment, we examine two hypotheses: (1) Short-term invoice PD can be predicted using shared invoice payment data; (2) Behaviour analysis can improve invoice PD prediction quality. Additionally, we are interested in which attributes revealed by the behaviour analysis have the most significant impact on the invoice probability of default. The control set provides test results above our expectations for all prediction models, as depicted in Table 1, with the prediction target set to 0.


Table 1. Basic data model test results (values in %).

Model  CA     Sens   Spec   AUC    IS     F1     Prec   Recall  Brier  MCC
CT     80.27  86.74  66.68  85.97  34.60  85.62  84.53  86.74   27.29  54.24
CN2    81.76  87.79  69.12  87.17  34.57  86.70  85.64  87.79   26.76  57.74
SVM    80.08  87.52  64.46  86.53  32.45  83.78  87.52  85.61   28.52  53.42
LR     89.80  93.75  81.50  94.89  54.82  92.56  91.40  93.75   15.72  76.39
NN     92.41  96.64  83.53  96.13  64.86  94.52  92.49  96.64   12.10  82.41

Table 2. Behaviour-enriched data model test results (values in %).

Model  CA     Sens   Spec   AUC    IS     F1     Prec   Recall  Brier  MCC
CT     85.87  90.14  76.91  90.55  53.14  89.63  89.12  90.14   21.58  67.48
CN2    84.14  94.43  62.53  91.45  47.41  88.97  84.10  94.43   22.47  62.40
SVM    83.38  89.86  69.77  89.12  40.50  86.18  89.86  87.98   25.09  61.21
LR     91.34  94.63  84.45  95.97  59.85  93.67  92.74  94.63   13.75  80.01
NN     95.25  97.34  90.88  98.49  75.38  96.53  95.73  97.34    7.29  89.07

We can see that classification accuracy (CA) ranges from 80% achieved by CT and SVM, 81.8% by the CN2 model, and 89.8% by LR, to the notable 92.4% achieved by the NN model. The sensitivity results (assessing the true positive rate) indicate that all of the models succeed in predicting positive results; even the lowest-performing CT provides an 86.7% success rate. The models are generally less successful with respect to specificity (assessing the proportion of negative examples detected among all negative examples): the CT, CN2, and SVM models’ results are below 70%, while LR and NN predict negative occurrences with more than an 80% success rate. The area under the receiver operating characteristic (ROC) curve (AUC) ranges from 86% (CT) to 96.1% (NN). The information score (IS) (Kononenko & Bratko, 1991), which excludes the influence of class probabilities, reveals a group of relatively low-performing models: CT, CN2, and SVM, all residing below 35%. The LR model achieved 54.8%, and the NN clearly outperformed the remaining models with a score of 64.86%. The measures of precision, recall, and the harmonic mean of precision and recall (F1) (Powers, 2011), all measuring the positive examples, are well above 80%, with NN reaching 94.52% in F1. The Matthews correlation coefficient (MCC) (Matthews, 1975), ranging from −100% to 100%, delivers results from 53.4% to 82.4%.

The Brier score (Brier, 1950) compares the predicted probability assigned to the possible outcomes for invoice PD with the actual outcome. Because it includes the predictions’ level of confidence, it measures the squared distance between the prediction confidence and the actual outcome. It is a negatively oriented measure: if all predictions have confidence and accuracy of 100%, it returns 0; if all predictions have confidence of 100% but prediction accuracy is 0%, it returns 1.
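To make the measures above concrete, the sketch below computes them from binary confusion-matrix counts and from predicted probabilities. It uses the standard textbook definitions and assumes non-degenerate counts (no empty row or column). The Brier score is shown in its binary form, the mean squared difference between the predicted probability and the 0/1 outcome; the tool used in the experiment may compute a multi-class variant, so this is an assumption rather than a reproduction of its implementation. The counts in the example are illustrative and are not taken from the experiment.

```python
import math
from typing import Dict, Sequence


def binary_metrics(tp: int, fn: int, fp: int, tn: int) -> Dict[str, float]:
    """Compute threshold-based measures from binary confusion-matrix counts."""
    total = tp + fn + fp + tn
    sensitivity = tp / (tp + fn)  # true positive rate, identical to recall
    specificity = tn / (tn + fp)  # true negative rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "CA": (tp + tn) / total,
        "Sens": sensitivity,
        "Spec": specificity,
        "Prec": precision,
        "F1": f1,
        "MCC": (tp * tn - fp * fn) / mcc_denominator,
    }


def brier_score(probabilities: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared distance between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probabilities, outcomes)) / len(outcomes)


# Illustrative counts only; they are not taken from the paper's experiment.
metrics = binary_metrics(tp=90, fn=10, fp=20, tn=80)
perfect = brier_score([1.0, 0.0], [1, 0])       # fully confident and always right
worst = brier_score([0.0, 1.0], [1, 0])         # fully confident and always wrong
```

The two Brier calls reproduce the extremes named in the text: a score of 0 for fully confident correct predictions and 1 for fully confident wrong ones.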
In our experiment, the models are separated into distinct groups: the CT, CN2, and SVM range between 26.7% and 28.5%, the LR achieves 15.72%, while the NN outperforms the remaining models with 12.1%, meaning that the average prediction confidence of successful predictions is higher than 87%.

At first glance, we can observe that all of the basic data models used in the experiment perform beyond expectations, and that the NN algorithm clearly outperforms the remaining models, followed by the non-linear LR model. Based on the basic data model testing results, two questions arise: Can the results be upgraded, and can the divergence between the models be reduced by introducing new behaviour-related data?

The data in Tables 2 and 3 positively answer the first question for all of the newly generated models and, surprisingly, manifest significant improvement even in the previously best-rated NN model. We can observe CA improvement in all of the models, the most significant being in the CT model with a rise of 5.6 percentage points, reaching 85.87%. Importantly, the CA of the already best-performing NN model is at 95.25%, which represents an improvement of 2.84 percentage points. It is remarkable to observe such high rises, considering the high accuracy of the basic data models.

Since sensitivity outperformed specificity in the basic data models, we expected a higher rise in specificity than in sensitivity. This is true in all models except the CN2 model: LR and NN sensitivity improves by less than 0.90 percentage points, while the specificity improvement is 2.95 percentage points for the LR and an exceptional 7.35 percentage points for the NN, reaching 90.88%. The SVM and CT models gained 2.34 and 3.4 percentage points in sensitivity, and 5.31 and 10.23 percentage points in specificity, where the basic data CT model was relatively low: it reached only 66.68%. We did not expect a drop of 6.59 percentage points in specificity for the CN2 model, downgrading it to merely 62.53%.

The Brier results are particularly important, since they consider the confidence of the models’ predictions. We can observe dramatic improvement (negative changes) in all of the models’ confidence, with the NN model achieving 7.29%, indicating that the average confidence of correct predictions reaches well over 90%.

Table 3. Difference between the basic and behaviour-enriched data model test results (percentage points).

Delta  CA    Sens  Spec   AUC   IS     F1    Prec   Recall  Brier  MCC
CT     5.60  3.40  10.23  4.58  18.54  4.01  4.59   3.40    −5.71  13.24
CN2    2.38  6.64  −6.59  4.28  12.84  2.27  −1.54  6.64    −4.29  4.66
SVM    3.30  2.34  5.31   2.59  8.05   2.40  2.34   2.37    −3.43  7.79
LR     1.54  0.88  2.95   1.08  5.03   1.11  1.34   0.88    −1.97  3.62
NN     2.84  0.70  7.35   2.36  10.52  2.01  3.24   0.70    −4.81  6.66


Fig. 3. Nomogram logistic regression: most important attributes.

Table 4. NN confusion matrix with behaviour-enriched data.

True class  Predicted 0  Predicted 1  Total
0           97.34%       2.66%        37,250
1           9.12%        90.88%       17,750
Total       37,876       17,124       55,000

Note: Columns represent predictions, rows represent true classes.
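As a consistency check, the figures in Table 4 can be recombined into the column totals and the overall classification accuracy. The sketch assumes that the percentages are row-normalised, that is, each row of the matrix sums to 100% of its true class; the variable names are chosen for illustration.

```python
# Row totals and row-normalised rates as printed in Table 4.
row_totals = {0: 37_250, 1: 17_750}
rate_predicted_0 = {0: 0.9734, 1: 0.0912}  # share of each true class predicted as 0

# Column totals: expected number of invoices predicted as 0 and as 1.
predicted_0 = sum(row_totals[c] * rate_predicted_0[c] for c in row_totals)
predicted_1 = sum(row_totals[c] * (1 - rate_predicted_0[c]) for c in row_totals)

# Correct predictions lie on the diagonal of the matrix.
correct = (row_totals[0] * rate_predicted_0[0]
           + row_totals[1] * (1 - rate_predicted_0[1]))
accuracy = correct / sum(row_totals.values())

print(round(predicted_0), round(predicted_1), f"{accuracy:.2%}")
```

Under this assumption the recomputed column totals fall within a few invoices of the printed 37,876 and 17,124, and the accuracy is within rounding distance of the 95.25% CA reported for the NN model in Table 2.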

The NN model’s results on the behaviour-enriched data can be assessed in Table 4, where columns represent predictions and rows represent true classes. We can see that of the 55,000 invoices in the test sample¹, only 2.66% of not actually paid invoices and 9.12% of actually paid invoices are predicted incorrectly. Since the NN is a black box model², we analysed attribute importance in the LR model, which provides the second-best test results. Attributes with the highest impact (DueDate, D_DaysNotPaid, and Debtor) are, as expected, provided in the basic data set. In the selected case, marked with blue dots, the DueDate provides 61 points (of a potential 100), D_DaysNotPaid 74 points (of a potential 90), and debtor identification 57 points (of a potential 85), as depicted in Fig. 3.

¹ The number of invoices is large due to the 10 repetitions of the random invoice selection.
² In our daily lives we are confronted at every turn with systems whose internal mechanisms are not fully open to inspection, and which must be treated by the methods appropriate to the Black Box (Ashby, 1956).

The BP have a significant effect on the prediction. The LR model identifies behaviour 5_2 (due invoices reduction) at the company-company level, behaviour 0_2 (amount of new debts) at the company-company level, and again 5_2 at the general debtor level as the three important attributes, each potentially contributing between 40 and 43 points; in the selected case, approximately 30, 30 and 20 points are reached, respectively. The R_1 (existence of old debts), valued at 0, provides all 37 points to the total score. The debtor-creditor behaviour 5_3 (paid invoices reduction) could contribute 34 points, but in the selected case, 16 points are added. In the LR model, an additional 30 attributes are added to the total score, but they are less significant, with a total potential score under 200 points. In the selected example, the invoice gathers 500 of the available 550 points, resulting in an 83% probability of the invoice being paid in the next 30 days.

5. Conclusions

Both research questions, involving the usability of shared payment data for short-term invoice PD prediction as well as the positive effects of behavioural analysis on PD accuracy, are positively answered. As evidenced in the research results, the evaluation of debtor alignment with certain behavioural patterns, or even payment-related strategies, has a positive effect on invoice PD accuracy.

The research process is initiated by the shortcomings of the long-term company-level PD for operational and tactical corporate finance management, where decisions on an invoice level are made. To provide better usage capabilities for the financial manager, we further define the PD metric. We propose to predict default at the invoice level and to focus on the time bracket of the next 30 days. We expect these results to better accommodate the business financial officers’ short-term requirements.

To provide unbiased and multi-sourced data at the required granularity, several options are discussed, especially focusing on the invoice data-sharing concept. In the data-sharing concept, multiple companies regularly share their data on invoices (un)paid by their business partners.

We argue that in short-term PD, not only the capacity to pay, but also the readiness to pay plays an important role. To better express the readiness to pay, we introduce a model of the behavioural pattern analysis. We identify validation rules and evaluate debtor affinity towards a particular pattern. In the final preprocessing step, we then enhance the invoice data with the information on the debtors’ affinity towards appropriate behavioural patterns.

To test the validity of the shared data and the effects of behavioural analysis on the data predictive power, multiple state-of-the-art PD predictive models are generated and tested twice: in the control run, only basic invoice properties are used, and in the second run, behavioural pattern-enriched data are added. The results are highly encouraging. Models based on basic shared invoice data provide highly accurate predictions, with classification accuracy ranging from 80% to 92.4%, confirming the first assumption. Moreover, by including behavioural pattern-enriched data, the prediction accuracy improves by up to 5.6 percentage points, delivering the accuracy of the best-performing model (NN) at 95.25%. Since all test results for all of the involved models are improved, the hypothesis testing succeeds beyond our expectations.

To further elaborate the behaviour attributes, the eight most important attributes used in the best-performing open box model (LR) are presented. This provides evidence that the behavioural data do not play a primary role in the decision-making, but still have a significant effect on prediction accuracy.

The approach proposed in this paper provides results that are new in operational research and financial management-related research:

• Identification and definition of a new PD metric: invoice PD, focused on supporting operational- and tactical-level financial management decisions
• Identification and definition of a new data source for predicting invoice PD: shared invoice data
• Definition of a behaviour analysis model for assessing payment strategies
• Confirmation of the predictive power of basic invoice data and the significance of behaviour analysis to the predictive power utilized by state-of-the-art predictive models
• Explanation of the properties affecting the invoice-level PD

The financial management research gains a new metric that substantially modifies the support of corporate decision making on the operational and tactical levels. By exploiting the invoice PD information, decisions on dues collection steps, insurance, and other credit risk mitigation processes can be made. Invoice PD information increases the accuracy of liquidity planning, resulting in higher optimisation of financial resources. Lastly, the short-term invoice PD can be integrated with long-term company-level PD to provide a complete picture of the landscape of business partners.

The most important contribution to the OR community is the proof that the selection of the right data and its understanding have an immense impact on prediction accuracy. The shared business partners’ behaviour data provide a promising data source for predicting business partners’ future behaviour. The sharing concept requires resolving multiple issues related to an appropriate understanding of the business environment. The sharing concept should be expanded to other business domains, for instance sales quantities, quality of the products and services, timeliness of deliveries, etc.

The proposed behaviour analysis methods of identifying patterns and assessing affinity should be further upgraded by invoking data from other sources, and by involving intelligent analysis for acquiring better reasoning on company properties and situations in particular instances of (not) paying the invoice. We should also consider supplementing behaviour-based models with other data and models, for instance time-series analysis.

Although real data were used in the experiment, several issues must be dealt with before the proposed concept is used in a real-life environment: insufficient capacities to manage large data quantities, locality- and time-related differences in data meaning, and the quality of the data analysed, to name a few. To improve the usability of the proposed approach, invoice PD evaluations for multiple thresholds are required: from daily PD in the first week, through weekly in the first month, to monthly for the next 90 days or maybe half a year.

The behavioural analysis and the method for evaluating its impact on prediction accuracy presented in this paper should be utilised in other OR fields where behaviour plays an important role, as, for instance, in customer retention and supplier-related risk management.

Supplementary material

Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.ejor.2016.08.039.

References

Allen, F. (1981). The prevention of default. Journal of Finance, 36(2), 271–276. doi:10.2307/2327008.
Altman, E. I., & Rijken, H. A. (2004). How rating agencies achieve rating stability. Journal of Banking & Finance, 28(11), 2679–2714.
Ashby, W. R. (1956). An introduction to cybernetics. London: Chapman & Hall. Retrieved from http://pcp.vub.ac.be/books/IntroCyb.pdf.
Barro, D., & Basso, A. (2010). Credit contagion in a network of firms with spatial interaction. European Journal of Operational Research, 205(2), 459–468. doi:10.1016/j.ejor.2010.01.017.
Basel, Committee (2006). Basel II: International convergence of capital measurement and capital standards: A revised framework.
Besser, T. L. (1999). Community involvement and the perception of success among small business operators in small towns. Journal of Small Business Management, 37(4), 16–29.
BIS http://www.bis.org/publ/bcbs189.htm
Blochlinger, A. (2012). Validation of default probabilities. Journal of Financial and Quantitative Analysis, 47(5), 1089–1123. doi:10.1017/s0022109012000324.
Brier, G. W. (1950). Verification of forecasts expressed in terms of probability. Monthly Weather Review, 78, 1–3.
Cao, L. (2010). In-depth behavior understanding and use: The behavior informatics approach. Information Sciences, 180(17), 3067–3085. doi:10.1016/j.ins.2010.03.025.
Chen, L., Koek, A. G., & Tong, J. D. (2013). The effect of payment schemes on inventory decisions: The role of mental accounting. Management Science, 59(2), 436–451. doi:10.1287/mnsc.1120.1638.
Clark, P., & Niblett, T. (1989). The CN2 induction algorithm. Machine Learning, 3(4), 261–283.
Coculescu, D., Geman, H., & Jeanblanc, M. (2008). Valuation of default-sensitive claims under imperfect information. Finance and Stochastics, 12(2), 195–218. doi:10.1007/s00780-007-0060-6.
Crouhy, M., Galai, D., & Mark, R. (2000). A comparative analysis of current credit risk models. Journal of Banking & Finance, 24(1-2), 59–117. doi:10.1016/s0378-4266(99)00053-9.
Davidson, I., & Tayi, G. (2009). Data preparation using data quality matrices for classification mining. European Journal of Operational Research, 197(2), 764–772. doi:10.1016/j.ejor.2008.07.019.
Davydenko, S. A., & Strebulaev, I. A. (2007). Strategic actions and credit spreads: An empirical investigation. Journal of Finance, 62(6), 2633–2671.
Demsar, J., Curk, T., Erjavec, A., Gorup, C., Hocevar, T., Milutinovic, M., . . . Zupan, B. (2013). Orange: Data mining toolbox in Python. Journal of Machine Learning Research, 14, 2349–2353.
Dionne, G., & Laajimi, S. (2012). On the determinants of the implied default barrier. Journal of Empirical Finance, 19(3), 395–408. doi:10.1016/j.jempfin.2012.03.004.


Doumpos, M., Kosmidou, K., Baourakis, G., & Zopounidis, C. (2002). Credit risk assessment using a multicriteria hierarchical discrimination approach: A comparative analysis. European Journal of Operational Research, 138(2), 392–412. doi:10.1016/s0377-2217(01)00254-5.
Earle, J. S., & Sabirianova, K. Z. (2002). How late to pay? Understanding wage arrears in Russia. Journal of Labor Economics, 20(3), 661–707.
Finlay, S. (2009). Are we modelling the right thing? The impact of incorrect problem specification in credit scoring. Expert Systems with Applications, 36(5), 9065–9071. doi:10.1016/j.eswa.2008.12.016.
Finlay, S. (2010). Credit scoring for profitability objectives. European Journal of Operational Research, 202(2), 528–537. doi:10.1016/j.ejor.2009.05.025.
Florez-Lopez, R. (2010). Effects of missing data in credit risk scoring. A comparative analysis of methods to achieve robustness in the absence of sufficient data. Journal of the Operational Research Society, 61(3), 486–501. doi:10.1057/jors.2009.66.
Harris, T. (2013). Quantitative credit risk assessment using support vector machines: Broad versus narrow default definitions. Expert Systems with Applications, 40(11), 4404–4413. doi:10.1016/j.eswa.2013.01.044.
Hu, Y.-C., & Ansell, J. (2007). Measuring retail company performance using credit scoring techniques. European Journal of Operational Research, 183(3), 1595–1606. doi:10.1016/j.ejor.2006.09.101.
Jappelli, T., & Pagano, M. (2002). Information sharing, lending and defaults: Cross-country evidence. Journal of Banking & Finance, 26(10), 2017–2045.
Ju, Y., Jeon, S. Y., & Sohn, S. Y. (2015). Behavioral technology credit scoring model with time-dependent covariates for stress test. European Journal of Operational Research, 242(3), 910–919. doi:10.1016/j.ejor.2014.10.054.
Kautonen, T., van Gelderen, M., & Tornikoski, E. T. (2013). Predicting entrepreneurial behaviour: A test of the theory of planned behaviour. Applied Economics, 45(6), 697–707. doi:10.1080/00036846.2011.610750.
Kennedy, K., Mac Namee, B., Delany, S. J., O’Sullivan, M., & Watson, N. (2013). A window of opportunity: Assessing behavioural scoring. Expert Systems with Applications, 40(4), 1372–1380. doi:10.1016/j.eswa.2012.08.052.
Kim, H. S., & Sohn, S. Y. (2010). Support vector machines for default prediction of SMEs based on technology credit. European Journal of Operational Research, 201(3), 838–846. doi:10.1016/j.ejor.2009.03.036.
Kononenko, I., & Bratko, I. (1991). Information-based evaluation criterion for classifiers performance. Machine Learning, 6(1), 67–80. doi:10.1007/bf00153760.
Leow, M., & Crook, J. (2014). Intensity models and transition probabilities for credit card loan delinquencies. European Journal of Operational Research, 236(2), 685–694. doi:10.1016/j.ejor.2013.12.026.
Li, M.-Y. L., & Miu, P. (2010). A hybrid bankruptcy prediction model with dynamic loadings on accounting-ratio-based and market-based information: A binary quantile regression approach. Journal of Empirical Finance, 17(4), 818–833. doi:10.1016/j.jempfin.2010.04.004.
Matthews, B. W. (1975). Comparison of predicted and observed secondary structure of t4 phage lysozyme. Biochimica Et Biophysica Acta, 405(2), 442–451. doi:10.1016/0005-2795(75)90109-9.

Orth, W. (2013). Multi-period credit default prediction with time-varying covariates. Journal of Empirical Finance, 21, 214–222. doi:10.1016/j.jempfin.2013.01.006.
Pagano, M., & Jappelli, T. (1993). Information sharing in credit markets. Journal of Finance, 48(5), 1693–1718.
Pan, S. J., & Yang, Q. (2010). A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering, 22(10), 1345–1359. doi:10.1109/tkde.2009.191.
Perko, I., & Mlinaric, F. (2016). Decreasing information asymmetry by sharing business data: A case of business non-payers sharing agency. International Journal of Risk Assessment and Management, 19, 1–2.
Perko, I., Primec, A., & Horvat, R. (2015). Sharing business partner behavior. Kybernetes, 44(6-7), 1030–1048. doi:10.1108/k-12-2014-0282.
Piramuthu, S. (2006). On preprocessing data for financial credit risk evaluation. Expert Systems with Applications, 30(3), 489–497. doi:10.1016/j.eswa.2005.10.06.
Powers, D. M. W. (2011). Evaluation: from precision, recall and F-Measure to ROC, informedness, markedness & correlation. Journal of Machine Learning Technologies, 2(1), 37.
Premachandra, I. M., Bhabra, G. S., & Sueyoshi, T. (2009). DEA as a tool for bankruptcy assessment: A comparative study with logistic regression technique. European Journal of Operational Research, 193(2), 412–424. doi:10.1016/j.ejor.2007.11.036.
Tong, E. N. C., Mues, C., & Thomas, L. C. (2012). Mixture cure models in credit scoring: If and when borrowers default. European Journal of Operational Research, 218(1), 132–139. doi:10.1016/j.ejor.2011.10.007.
Uzzi, B., & Gillespie, J. J. (2002). Knowledge spillover in corporate financing networks: Embeddedness and the firm’s debt performance. Strategic Management Journal, 23(7), 595–618. doi:10.1002/smj.241.
Wang, A. W., & Zhang, G. (2009). Institutional ownership and credit spreads: An information asymmetry perspective. Journal of Empirical Finance, 16(4), 597–612. doi:10.1016/j.jempfin.2009.04.002.
Webb, T. L., & Sheeran, P. (2006). Does changing behavioral intentions engender behavior change? A meta-analysis of the experimental evidence. Psychological Bulletin, 132(2), 249–268. doi:10.1037/0033-2909.132.2.249.
Wolter, M., & Roesch, D. (2014). Cure events in default prediction. European Journal of Operational Research, 238(3), 846–857. doi:10.1016/j.ejor.2014.04.046.
Wood, M. O., Noseworthy, T. J., & Colwell, S. R. (2013). If you can’t see the forest for the trees, you might just cut down the forest: the perils of forced choice on “seemingly” unethical decision-making. Journal of Business Ethics, 118(3), 515–527. doi:10.1007/s10551-012-1606-x.
Yao, X., Crook, J., & Andreeva, G. (2015). Support vector regression for loss given default modelling. European Journal of Operational Research, 240(2), 528–538. doi:10.1016/j.ejor.2014.06.043.
Zhang, S. C., Zhang, C. Q., & Yang, Q. (2003). Data preparation for data mining. Applied Artificial Intelligence, 17(5-6), 375–381. doi:10.1080/08839510390219264.
Zhu, B., He, C. Z., & Jiang, X. Y. (2015). Customer choice prediction based on transfer learning. Journal of the Operational Research Society, 66(6), 1044–1051. doi:10.1057/jors.2014.65.
