
Fintech platforms: Lax or careful borrowers’ screening?


Can peer-to-peer lending platforms mitigate fraudulent behaviors? Or have lending players been acting similar to free-riders? This paper constructs a new proxy to investigate lending platform misconduct and compares the FICO score and the LendingClub credit grade. To examine whether the lack of verification by the Fintech platform affects lenders’ collection performance, I explore the recovery rate (RR) of non-performing loans through a mixed-continuous model. The regression results show that the degree of prudence taken by the lending platform in the pre-screening activity negatively affects the detection of some misreporting borrowers. I also find that the Fintech platform’s missing verification information (e.g., annual income and employment length) affects the RR of non-performing loans, thereby hampering lenders’ collection performance.


This paper examines Fintech marketplaces’ roleFootnote 1 in affecting the credit quality of borrowers and in detecting their fraudulent behavior. Lending marketplaces, also referred to as peer-to-peer (or P2P) lending platforms, have proliferated over the last decade, gaining large market shares in consumer and small business loans. P2P platforms are designed as two-sided marketplaces that, by leveraging innovative technologies, enable investors to lend to borrowers directly and provide broad benefits in the cost and speed of investment decisions. However, some suspect that the reliability of the P2P lending market has decreased over the last few years.Footnote 2 In 2016, the Department of Justice (DOJ)Footnote 3 and the Securities and Exchange Commission (SEC) accused LendingClub of false statements to financial institutions, wire fraud, and covered conduct. Renaud Laplanche was removed as CEO of the company by the board of directors. All the fraudulent activities were aimed at increasing LendingClub’s volume of loan originations by approving borrowers who did not satisfy the Credit Policy.Footnote 4 Lending platforms have developed internal risk rating models to gauge the riskiness of applicants by using sophisticated algorithms and by relying on self-reported borrower information (e.g., annual income and employment length). With such models, lending platforms prescreen loans, list some on their websites, and allocate applications into respective risk baskets. However, traditional concerns related to the burden of information asymmetries are intensified in the unsecured online market, as economic agents have no face-to-face contact. Some borrowers may have incentives to alter the submitted data by inflating asset information (Lucas 1976; Jagtiani and Lemieux 2018). Moreover, the P2P lending platform acts without skin in the game, and lenders bear the credit risk. Platforms could therefore be encouraged to boost loan volumes in order to increase their remuneration.
This work aims to analyze players’ incentives in the growing crowdfunding market. Specifically, this is a pioneering attempt to investigate how the platforms’ incentives shape their behavior, leading them to act as dishonest brokers and to fail to identify misleading borrowers. To this end, this work measures platform misconduct through two proxies. The first is the prudence index, created to capture the quality of borrower screening on the platform. The second is the internal verification process, used to identify whether a failure to check the information reported by borrowers (annual income and employment length) could hamper lenders’ collection performance. The goal is achieved in two steps. To start, I construct a new index that attempts to capture the degree of prudence exercised by the platform at loan origination. The new index is built from the difference between the FICO score (i.e., the external rating provided by credit bureaus) and the LendingClub (LC hereafter) credit grade, and takes the following values: 1 (low prudence, as if LC has been underestimating borrower riskiness), 2 (neutral level, the assessment of borrower riskiness is the same in both models) and 3 (high prudence, as if LC has been overestimating borrower riskiness). A Generalized Ordered Logistic (Gologit) regression is implemented because the response variable is ordered and categorical. The ratios of false statements, adopted in the financial literature on borrowers’ misreporting, are used as the main predictors of the prudence index. Following Garmaise (2015), I use the ratio of rounded reported income and loan amount. The relationship between rounded self-reported values and the delinquency rate is well documented in previous works (Eid et al. 2016; Pursiainen 2020; Talavera and Xu 2018). Polena and Regner (2018) find that more words in the loan purpose description are associated with less creditworthy borrowers and higher default rates.
Based on this work, I construct a third indicator that measures the number of words provided by the borrower in the description of the loan purpose. A fourth indicator captures whether the borrower inflates the length of employment to access better credit-line conditions. The empirical analysis shows that borrowers with ten years of employment are associated with a higher delinquency rate. The platform’s screening quality has been decreasing over the previous years, suggesting the implementation of more aggressive underwriting to recover lost volumes. Regression results confirm that all misreporting variables negatively impact the response variable, implying that the Fintech platform does not adjust the credit grade based on potential borrowers’ false statements. In the second step, to evaluate whether the platform’s missing verification process may harm lenders by not granting them minimal coverage in the case of borrowers’ default, I investigate the determinants of the recovery rate (RR) on non-performing P2P loans. I introduce RR modeling of defaulted loans in the P2P lending market by using a mixed continuous-discrete model already established and applied in other studies of the mortgage market (Chawla et al. 2016; Tanoue et al. 2017). Using a novel loan-level dataset from LendingClub between July 2007 and September 2018, I test the RR’s principal determinants on defaulted loans. Because the RR distribution presents a high concentration at zero, I apply a mixed continuous-discrete model based on the work by Calabrese (2014).Footnote 5 The regression results show that the loan amount and the interest rate are positive determinants of the RR; unverified loans, rating volatility, and the number of borrower delinquencies negatively impact the RR. I test the robustness of the results by implementing the regressions within each risk class, leading to similar conclusions. The relationship between unverified loans and the RR is significantly negative.
Therefore, if the P2P platform had enacted the verification process on information self-reported by borrowers, the RR and the loss given default would be higher and lower, respectively, resulting in better lenders’ collection performance. As a robustness check, I regress the variable verification process on the probability of default in an unreported analysis. The results confirm a positive relationship between the verification process and the borrowers’ default. However, to first address potential endogeneity by correcting for omitted variables, I have rerun the regression analysis by including the rating grade as a predictor. The positive relationship between the verification process and the probability of default is still held, as shown in Table 12. I split the initial sample by rating classes to further strengthen the results, proceeding with regressions for the three loan risk classes (e.g., A-B, C-D, and E-F-G). The positive relationship between the verification process and the probability of default is still confirmed in all risk classes, as shown in Table 13. This finding suggests that the verification process of borrowers’ self-reported information should be improved. Thus, this result may identify some negligent and opportunistic behavior of the online lending platform. This study contributes to the literature in several ways. First, the paper contributes to the burgeoning literature on the P2P lending market, filling a literature gap by examining the trade-off between maximizing profits and inferring borrowers’ quality in a completely unbiased manner. Thus far, the literature on the online lending market has mainly focused on how borrower’ soft and hard information affects the likelihood of default (Emekter et al. 2015; Carmichael 2014; Lu et al. 2012; Serrano-Cinca et al. 2015; Polena and Regner 2018) and the roles of alternative data and machine learning in improving access to credit and screening quality (Berg et al. 
2018; Balyuk and Davydenko 2019; Duarte et al. 2012; Everett 2015; Freedman and Jin 2017; Hertzberg et al. 2018; Jagtiani and Lemieux 2018; Pope and Sydnor 2011; Shen et al. 2021). To the best of my knowledge, this is the first study to investigate how the Fintech platform could affect loan screening by inflating the borrower’s quality. Iyer et al. (2016) and Vallée and Zeng (2019) study how peer lenders can predict an individual’s likelihood of defaulting on a loan with greater accuracy than the borrower’s credit score, showing that sophisticated investors screen loans differently. Second, this work adds to the extensive stream of research on financial and accounting misconduct that encompasses the relationship between CEO equity incentives and false statements (Bergstresser and Philippon 2006; Burns and Kedia 2006; Cheng and Warfield 2005; Efendi et al. 2007; Jensen and Meckling 1976). It is linked to this stream of research because, similar to how CEOs could have personal incentives to falsify corporate balance sheets, the P2P lending platform could underestimate borrowers’ credit risk to increase its remuneration by not performing due diligence (Cumming et al. 2019). Third, the study integrates the literature on borrower misreporting in the mortgage market, which finds a strong association between borrowers’ misreporting and adverse loan outcomes (Agarwal and Ben-David 2014; Garmaise 2015; Griffin and Maturana 2015; Jiang et al. 2014; Piskorski et al. 2015). This study is also linked to the work by Talavera and Xu (2018) on loan verification and to Pursiainen (2020), who shows that the LendingClub platform does not adjust loan pricing for misreporting borrowers. Finally, this paper contributes significantly to the growing literature on the estimation of the loss given default (LGD) and RR in the unsecured market (Calabrese 2014; Gourieroux and Lu 2019; Ye and Bellotti 2019; Siao et al. 2016; Zhou et al.
2018), advising lenders to focus on additional credit risk measures to accurately assess borrower creditworthiness. In marketplace lending, the information asymmetries between borrowers and lenders lead to higher default rates and large LGD. Additionally, this study provides valuable insights to policymakers by highlighting critical factors that could raise financial stability concerns if funding dries up because of consumers’ loss of confidence. The paper also gives novel practical insights for lending platforms that might represent a concrete solution to credit rationing. Through its results, this study provides suggestions for lending platforms to improve the loan verification process to detect misreported information by some borrowers and to strengthen their internal corporate governance, for instance, through the adoption of measures aimed at punishing false statements by some applicants. The remainder of the paper is organized as follows. Related literature is reviewed in the second section. Summary statistics are reported in the third section. The empirical methodologies on the RR and the prudence index are in the fourth and fifth sections, respectively. The sixth section concludes the manuscript.

Related literature

Financial misreporting fraud

The extensive literature on financial misreporting fraud has examined why managers engage in corporate earnings misreporting by analyzing their equity incentives to misreport (Bergstresser and Philippon 2006; Burns and Kedia 2006; Cheng and Warfield 2005; Efendi et al. 2007; Jensen and Meckling 1976).Footnote 6 Misconduct, however, is an inevitable effect of the capital market. The burden of analysts’ forecasts puts pressure on managers, who are willing to destroy firm value to avoid severe punishment by the market (Degeorge et al. 1999). Financial misreporting may be facilitated when the CEO is also the firm’s founder, serves as chairman, or belongs to the founding familyFootnote 7 (Agrawal and Chadha 2005; Dechow et al. 1996) because of more reliable connections with other top executives and directors (Altunabas et al. 2018; Khanna et al. 2015). The economic literature is rich in empirical and theoretical studies highlighting the role of reputational loss in deterring financial misreporting and aggressive accounting policy (Giannetti and Wang 2016; Karpoff and Lott 1993; Murphy et al. 2009).Footnote 8 The monetary penalties for sued firms are lower than the reputational losses imposed by the market (Karpoff and Lott 1993), which are nearly nine times the size of the fines associated with wrongdoing. Thakor and Merton (2018) assert that trust is more difficult to gain than to lose. This asymmetry could be amplified in the P2P lending market because the incentives to maintain trust are weaker than in the traditional banking system. Banks could have stronger incentives to make good loans because they use money raised through deposits, and damage to lenders’ trust can endanger future fundraising, whereas Fintech platforms are investor-financed. The platforms’ incentives may affect their ability to distinguish between misleading and truthful borrowers.
This stream of research is related to the liars’ loan problem, discussed widely in the mortgage loan market following the financial crisis (Jiang et al. 2014). Griffin and Maturana (2015) sought to identify potential fraud through three indicators of misreporting on low and total documentation loans, finding that approximately 48% of loans had at least one sign of misrepresentation. Empirical evidence on mortgage loans shows that borrowers reporting asset information above the threshold were almost 25 percentage points more likely to become delinquent than those just below it (Garmaise 2015). Building on this evidence, Eid et al. (2016) and Pursiainen (2020), using a complete dataset from LendingClub, show that borrowers with a tendency to round their income are more likely to default than those who report their income more accurately. Moreover, lenders are not compensated for the additional risk associated with rounding borrowers, whose loans are priced at a lower interest rate. Despite their limitations, the studies mentioned above collectively explain why misconduct has become an important issue and suggest potential proxies for measuring misconduct risk in the financial market.

P2P loan performance and lax screening

A large body of contemporary studies has examined different features of P2P lending. The first stream of research has focused on the importance of soft and hard information in mitigating asymmetric information in borrower-lender interactions. The traditional bankruptcy prediction models for small and medium enterprises (SMEs) use accounting-based financial ratios typically. Kou et al. (2021) have proved the economic benefit of transactional data and payment network-based variables for bankruptcy prediction. Several aspects can contribute to predicting the credit risk of borrowers,Footnote 9 such as the economic value of networks and online friendships (Lin et al. 2013; Freedman and Jin 2017), maturity choice of loans as a signal of the higher risk of worsening of creditworthiness (Yao et al. 2019; Hertzberg et al. 2018), social media information (Iyer et al. 2016), digital footprint (Berg et al. 2018; Ge et al. 2017) and borrowers’ characteristics (Carmichael 2014; Emekter et al. 2015; Serrano-Cinca et al. 2015).Footnote 10 According to Basel Accords, investors should be mindful of the default rates and LGD in making investment decisions and assessing credit risk for loans. Recently, studies have sought to evaluate the LGD in the P2P setting. For instance, Zhou et al. (2018) present the first model of LGD, using data from LendingClub, and describe the probability density function of LGD as a unimodal distribution with the high value peaking in the unsecured bond market. They also find negative relationships between credit grade, debt-to-income ratio, and LGD, and that borrowers’ total assets do not have a significant impact. In contrast, Papoušková and Hajek (2020) assert that LGD does not follow the normal distribution, and they have adopted a random forest learning method to reduce overfitting. 
I follow the perspective of Ye and Bellotti (2019) and Calabrese (2014), who have used beta mixture regression in modeling the RR on non-performing loans in the mortgage market. Furthermore, the literature has mainly examined the relationship between lenders and borrowers in the P2P lending context, looking at the platform as an honest broker and borrowers as misleading users. According to Cumming et al. (2019), one important issue is what role the platform should play in the governance of crowdfunding marketplaces. Fee structures in the lending market affect how platforms carry out their core business as they seek to maximize their revenue. Fraser et al. (2015) state that although online platforms can ease financial constraints, their role in the context of monitoring and governance is still unclear. However, platforms’ activity should be examined because they serve a double purpose: they are at the same time a credit agency in screening loans and providers of investment decisions (Bertsch and Rosenvinge 2019). Banks retain a fraction of all originated loans, which acts as a signal of asset quality by ensuring that they have skin in the gameFootnote 11 (Daley et al. 2020), unlike P2P platforms, which are reluctant to retain a fraction of originated loans. Likewise, ratings issued by credit rating agencies (CRAs) that are subject to skin-in-the-game requirements are more accurate than those issued by agencies without such requirements (Ozerturk 2015). According to Lucas’s critique (1976), a statistical model could be deceiving because agents’ incentives change or alter the data’s real nature.Footnote 12 Platforms might be tempted to reduce lending standards by offering too many low-quality loans to boost loan volume beyond sustainable levels, thereby negatively affecting unskilled investors who rely on their judgment (Balyuk and Davydenko 2019). Consistent with this view, Keys et al.
(2010) empirically demonstrated that the securitization process affected the adverse selection problem by increasing financial intermediaries’ incentives to screen borrowers carelessly. Recently, a few studies have attempted to evaluate the effectiveness of the credit scoring systems used by P2P lending platforms. Wang et al. (2021) state that the credit rating of loans is vital in assessing default risk. Their study is the first to examine cost-sensitive classifiers and measure the misclassification costs of different credit grades in P2P lending. Jagtiani and Lemieux (2018) find that the correlation between FICO scores and LC grades declined from approximately 80% to 35% for loans that originated in 2014–2015. They state that a significant portion of borrowers, previously classified as subprime based on the FICO score, are slotted into a better risk class. Gao et al. (2017), using loan data from a Chinese P2P lending platform, classify the platform’s evaluation systems into forward-looking mechanisms, based on borrowers’ information, and backward-looking mechanisms, based on their historical repayments. They show that the backward-looking system encourages bad borrowers to default after they have earned high enough credit scores to borrow a large amount, suggesting the need to improve the credit scoring model. Talavera et al. (2018), using data from a leading Chinese lending platform, find a positive relationship between the default rates of loans and borrowers with incompletely verified information. The LendingClub platform asks borrowers to provide some personal information. Specifically, the self-reported data are annual income and length of employment. LendingClub could ask the potential borrower to verify the self-reported information or only its source, for instance, the source of income or the company where the borrower works. Some borrowers obtain funding without information verification.
Therefore, the verification process seems to be a subsidiary activity in the lending market (Carmichael 2014; Jagtiani and Lemieux 2018; Polena and Regner 2018). The adoption of due diligence mitigates potential reputation costs and litigation resulting from loans that should not have been originated due to lower quality (Cumming et al. 2019). Tao et al. (2017) also note that because of the lack of official credit records of borrowers and the information submitted by themselves on which the platforms’ credit rating system is based, inaccurate or false data is not easily identifiable in the verification process. Platforms should use due diligence in adopting a more robust verification mechanism to improve the efficiency of the crowdfunding market. Therefore, the verification system could offer an alternative way to decrease adverse selection problems, not only in detecting fraud and liars’ loans by validating borrowers’ documentation but also, according to Signaling Theory (Spence 1973), as a signal of the asset quality by increasing its reputation. For instance, has developed both online and offline verification tools, such as physical site visits, to check the information submitted by borrowers, increasing lenders’ trustworthiness and guaranteeing the survival of the crowdfunding market (Huang et al. 2021; Tao et al. 2017).

The hypothesis development

According to prior studies that have used different measures and explanations to explore how players’ incentives affect their conduct in the market (Chami et al. 2010; Gorton and Pennacchi 1995; Mason et al. 2009), more research is necessary to understand this issue in the crowdfunding market. Hildebrand et al. (2017) examined the players’ incentives in the crowdfunding market for the first time by providing empirical evidence on adverse incentives that are not fully recognized in the market. However, only a few studies have attempted to explore the role of online lending platforms in assessing and monitoring borrowers’ creditworthiness with conceptual discussion. The crowdfunding platform is driven by profit and ethical or reputational concerns (Cumming et al. 2019; Hildebrand et al. 2017). The impact of fee structures on their behaviors in the crowdfunding market remains unclear. To fill this gap, in this paper, I investigate the linkage between the lending platform’s prudence degree, the proper detecting of misleading borrowers, and the lenders’ collections performance. From the discussions above, I have drawn the following hypothesis:


P2P lending platforms’ incentives affect the signaling of misreporting borrowers, thereby hampering lenders’ collection performance (e.g., recovery rate).

The theoretical framework is shown in Fig. 1.

Fig. 1 Theoretical framework

To test this hypothesis, I construct a new proxy of platforms’ misconduct by comparing the LC rating grade and FICO score. I adopt this index to explore whether the assessment of borrowers’ riskiness through automated credit grading algorithms based on machine learning techniques (e.g., LC rating grade) is different from traditional credit scoring (e.g., FICO score). Therefore, if the platform has been underestimating borrowers’ riskiness, it could harm lenders’ collections in the case of borrowers’ default. The platforms’ incentives have significant implications for both lenders and borrowers because improper conduct of platforms may lead to the collapse of the crowdfunding market (Hildebrand et al. 2017; Vismara 2018).

Data description and descriptive statistics

LendingClub operational overview

The research is focused on LendingClub, a leading platform in the US that was established in 2007; it was the first lender to register its offerings as securities with the SEC. By the end of December 2019, it had provided almost 3 million loans, with a total lending amount of over $42.53 billion. Borrowers need to comply with the requirements of the platform to successfully apply for a loan. For instance, LC rejects borrowers with FICO scores less than 600, a credit history of less than three years, and a debt-to-income ratio of more than 40%. Potential borrowers have to report their annual income, employment status, current home situation, and other personal information. If they meet these requirements, LC assigns them a fixed interest rate based on its credit grades, which range from A (the lowest risk) to G (the highest risk), with subgrades from A1 to G5. For instance, the average interest rate for A1 was 5.53% and for G5 was 29.14% between July 2007 and September 2018. Typically, Fintech platforms might verify whether borrowers’ reported income is within 10% of their actual income, or verify the employment source. However, the platform may grant loans before conducting the verification because verification is carried out only on a targeted subset of loans. If the borrowers’ self-reported data are not truthful, such as overstated income, their applications cannot be removed, and they can still go ahead with loans without any penalty. Furthermore, LC tested a new verification process in the fourth quarter of 2018 to reduce friction for borrowers seeking loans. Consistent with this, LC does not guarantee the trustworthiness of borrowers’ data, but it attempts to take reasonable actions to mitigate liars’ applications.

Sample construction and descriptive statistics

As outlined in the previous section, I use a personal loan dataset from LendingClub, which encompasses all consumer loans issued between July 2007 and September 2018. The sample ends in the third quarter of 2018 to observe loan performance over almost 2 years post-origination. I drop loans that did not meet the credit policy, which originated between 2009 and 2010.Footnote 13 The total sample contains 1,959,440 loans. The detailed descriptions and the descriptive statistics of the variables used in the empirical analyses are reported in Table 1. In constructing the prudence index and misreporting variables, I focus on consumer loans, both current and mature. For the RR and LGD analysis, the empirical study only involves mature loans, resulting in 1,494,741 borrowers’ records.

Table 1 Variable description and descriptive statistics

As shown in Table 1, the average loan amount selected by borrowers is approximately $15,000, and the average interest rate of the loan is 13%. LendingClub provides two types of loans: 36-month short-term loans and 60-month long-term loans. In the sample, 60% of loans have a short maturity of approximately three years. For borrowers’ self-reported data, the average annual income is $79,563, with approximately 5 years of employment. Almost 50% of applicants have a mortgaged home, and only 10% own one. On applicants’ indebtedness features from the Credit Bureau, most applicants have low FICO scores of approximately 700. However, the loan applicants have an average of 12 credit lines opened and approximately 24 completed lines, and the median rate of utilization is approximately 50%. Concerning the verification of data reported by borrowers, 32% of loans have neither income nor employment status verified; another 40% are source verified, and the remainder have both verified. Approximately 78% of consumer loans are for credit card and debt consolidation purposes, as shown in Fig. 2. The distribution of loans by status in the sampling period is shown in Table 2.

Fig. 2 Consumer loans by the stated purpose. Notes. This graph shows loan distribution by a self-reported goal by borrowers. As highlighted, a large part is specified to be used for consolidating borrowers’ liabilities

Table 2 Loan status by each year

Evaluation of prescreening activity

The weakness of the control system could incentivize some borrowers to inflate or falsify their self-reported information. This section investigates the prescreening quality at the time of loan origination, using as a proxy of platform misconduct the degree of prudence adopted by the platform in allocating borrowers to specific risk classes. The empirical analysis is performed with loan-level data from LendingClub, including current loans issued between July 2007 and September 2018. I begin by evaluating the discrepancies between the FICO score and the LC grade through a correlation analysis. As shown in Fig. 3, the correlation has decreased from 76% for loans issued in 2007 to 40% in 2018. Figure 4 presents the composition of loans for each LC rating grade and FICO score based on the loan verification status. Some consumers, defined as subprime with a FICO score below 680, are slotted into better loan classes based on LC’s rating grade in the unverified and source verified stages (Fig. 4).

Fig. 3 Correlation between FICO score and the LC rating grade. Notes. This graph displays the association between two rating systems on the whole data set, involving also current loans. As can be seen, their relationship has been changing over time, confirming a declining trend. Nevertheless, the relation in 2016 seems to improve slightly once again
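The yearly correlation analysis described above can be illustrated with a minimal sketch. It assumes a Pearson coefficient (the text does not state which coefficient is used) and codes LC grades numerically from G = 1 to A = 7, the same coding the paper later adopts for the prudence index. The function names and the toy records are hypothetical and only show the mechanics.

```python
from collections import defaultdict
from math import sqrt

# Numeric coding of LC grades (assumption: G = 1 is riskiest, A = 7 is safest).
GRADE_TO_SCORE = {"G": 1, "F": 2, "E": 3, "D": 4, "C": 5, "B": 6, "A": 7}


def pearson(xs, ys):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_by_year(loans):
    """Group (issue_year, fico, lc_grade) records by year and correlate
    the FICO score with the numeric LC grade within each year."""
    by_year = defaultdict(lambda: ([], []))
    for issue_year, fico, grade in loans:
        ficos, grades = by_year[issue_year]
        ficos.append(fico)
        grades.append(GRADE_TO_SCORE[grade])
    return {year: pearson(f, g) for year, (f, g) in sorted(by_year.items())}


# Hypothetical toy records, not taken from the LendingClub data.
toy_loans = [
    (2007, 650, "G"), (2007, 700, "D"), (2007, 750, "A"),
    (2018, 650, "A"), (2018, 700, "D"), (2018, 750, "G"),
]
yearly = correlation_by_year(toy_loans)
```

A rank-based coefficient such as Spearman's could be substituted if the ordinal nature of the grades is a concern; the grouping logic would stay the same.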

Fig. 4 FICO and LC grade distribution by verification status. Notes. These graphs show the relationship between FICO and LC grade by loans’ verification status. The empirical analysis is performed on the whole data set, including loans issued between 2007 and Q3 2018

Consequently, I focus on the ability of the LC rating grade to infer borrower quality to evaluate the prudence of the LendingClub marketplace over time. Higher-quality prescreening means that good loans are more accurately distinguished from bad loans, which are either screened out or placed in the appropriate risk bucket. Following the literature (see Iyer et al. 2016; Vallée and Zeng 2019), I measure the accuracy of the LC prescreening activity by building receiver operating characteristic (ROC) curves. To perform the analysis, I assess the likelihood of charged-off loans using the LC grade as the predictor. Thus, a higher area under the curve (AUC) means that the system is a good predictor of defaulted loans.Footnote 14 The test is computed separately on loans between 2015 and 2018 to capture how the screening quality of Fintech platforms evolves. ROC analysis results are displayed in Fig. 5, showing that the predictive power of LC’s rating grade for defaulted loans has decreased by 0.03 percentage points over the last years.

Fig. 5 ROC curve of LendingClub grades. Notes. This figure plots the ROC curve, which plots the true positive rate (sensitivity) against the false positive rate (1 − specificity), obtained by using the LC rating grade as a predictor of defaulted loans. The analysis is performed each year separately on loans issued between 2016 and 2018. The larger the AUC, the better the model is
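As a rough illustration of the ROC analysis above, the AUC of a grade-only predictor can be computed directly from its probabilistic interpretation: the chance that a randomly chosen charged-off loan carries a riskier grade than a randomly chosen repaid loan, with ties counted as one half. This is a minimal sketch, not the estimation used in the paper; the risk coding (A = 1 to G = 7) and the function name are assumptions made for the example.

```python
# Risk ranking of LC grades for this sketch (assumption: larger = riskier).
GRADE_RISK = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "G": 7}


def auc_from_grades(grades, charged_off):
    """AUC of the LC grade as a predictor of charge-off.

    grades      -- iterable of grade letters "A".."G"
    charged_off -- iterable of truthy/falsy outcomes, same length
    """
    pos = [GRADE_RISK[g] for g, y in zip(grades, charged_off) if y]
    neg = [GRADE_RISK[g] for g, y in zip(grades, charged_off) if not y]
    if not pos or not neg:
        raise ValueError("need at least one defaulted and one repaid loan")
    wins = 0.0
    for p in pos:            # pairwise comparison, O(n^2): fine for a sketch,
        for q in neg:        # too slow for millions of loans
            if p > q:
                wins += 1.0  # defaulted loan had the riskier grade
            elif p == q:
                wins += 0.5  # tie between grades
    return wins / (len(pos) * len(neg))


# Hypothetical toy sample: two repaid loans (A, C) and two charged-off (C, G).
toy_auc = auc_from_grades(["A", "C", "C", "G"], [0, 1, 0, 1])
```

Running the function separately on each issue-year cohort would reproduce the year-by-year comparison described in the text; for a sample of realistic size, a rank-based formula or a library routine would be the practical choice.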

Prudence index

To proxy the degree of prudence of the platform’s risk management, I construct a prudence index, defined as the difference between the LC credit grade and the FICO score, namely the internal and external ratings, respectively. The prudence index attempts to capture the underestimation of risk by Fintech platforms. I aim to investigate the determinants that affect the prudence exercised by the lending platform, which might be encouraged to excessively underestimate credit risk to increase its remuneration. First, I convert LC’s rating grade from a categorical to a numeric scale, where G is 1 (highest risk), F is 2, … and A is 7 (lowest risk). For instance, if LC assigns borrowers to the G risk class, they have a higher likelihood of defaulting on loans. Then, I classify FICO scores into seven segmentsFootnote 15 from the lowest to the highest score, in which borrowers with high credit risk are assigned to the lowest segment, for instance, a score lower than 660. The splitting of the FICO scores into risk buckets is based on the work by Balyuk and Davydenko (2019). The prudence index is built from the difference between the two reclassified credit risk systems and takes three levels: low prudence = 1, same prudence = 2, and high prudence = 3. For instance, the difference between the LC and FICO scores is greater than 1 when a borrower receives a score equal to 3 from LC and equal to 1 from FICO. This means that LC allocates the borrower to a lower risk class (i.e., 3) than FICO, which assigns the same borrower to a higher risk class (i.e., 1). In the opposite case, if the difference is lower than 1, LC allocates the borrower to a higher risk class (i.e., 1), while FICO assigns the same borrower to a lower risk class (i.e., 3).

In this case, the FICO score has underestimated the borrower’s riskiness. Instead, both FICO and LC assess borrowers’ riskiness in the same way in the middle case, for instance, when the two credit rating systems assign the same score to borrowers. The index takes value 1 when the LC rating grade underestimates the borrower’s risk (lowest prudence); 2 if the LC credit assessment is similar to FICO, and 3 when the LC rating is more prudent than the FICO score (highest prudence). In the empirical analysis, the prudence index takes value 1 for approximately 88% of the whole sample, confirming that LC’s rating grade has included borrowers as A-rated or B-rated most of the time. Therefore, to assess the LC marketplace’s screening prudence, we use misreporting variables that have been already used in literature and are associated with significantly higher borrowers’ delinquency rates. Following previous studies on the financial literature, the borrower’s inaccurate or untruthful information can signal potential misreporting. Following behavioral studies, when people are asked to estimate a value, they are inclined to provide a rounded estimation. This tendency is more likely in people lacking specific knowledge or documentation. Previous studies on this topic have also shown that borrowers reporting above-rounded number values for their assets have significantly higher delinquency rates in the P2P lending market (Eid et al. 2016). P2P loans with a goal amount to a round number are associated with a lower probability of finding success in the reward-based crowdfunding market (Lin and Pursiainen 2018). Based on potential misreporting indicators established in the financial literature (Garmaise 2015; Pursiainen 2020), I identify the self-reported annual income of borrowers as misreporting when it is divisible by 5,000 and 10,000. The cut-off points are set in the literature and reflect the rounding of values reported by people. 
Based on previous works by Garmaise (2015), Eid et al. (2016), and Pursiainen (2020), I adopt the following misreporting indicators, each taking the value true when (1) reported income is divisible by 5,000, (2) reported income is divisible by 10,000, (3) the loan amount is divisible by 5,000, and (4) the loan amount is divisible by 10,000. I add to the literature two further variables: (5) the length of the loan title provided by borrowers and (6) the suspicion of a false statement about the length of employment. Table 3 lists summary statistics for the misreporting variables, and Fig. 6 displays the relationship between the delinquency rate and the length of employment, including its maximum level.
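The rounding-based indicators listed above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used in the paper: the function and key names are hypothetical, the guard that only positive amounts count as rounded is an assumption, and the employment-related indicator is left out because its exact definition is not given in this section.

```python
def misreporting_flags(annual_income, loan_amount, loan_title):
    """Return the rounding-based misreporting indicators for one loan."""

    def is_multiple(value, base):
        # Assumption: only positive amounts count as rounded values
        return value > 0 and value % base == 0

    return {
        "income_div_5000": is_multiple(annual_income, 5000),
        "income_div_10000": is_multiple(annual_income, 10000),
        "amount_div_5000": is_multiple(loan_amount, 5000),
        "amount_div_10000": is_multiple(loan_amount, 10000),
        # Title length measured as the number of words in the loan title
        "title_word_count": len(loan_title.split()),
    }


# Hypothetical borrower: income of 65,000 and a 12,000 loan
flags = misreporting_flags(65000, 12000, "Debt consolidation loan")
```

For this hypothetical borrower, only the first income indicator is true, since 65,000 is a multiple of 5,000 but not of 10,000, and 12,000 is a multiple of neither.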

Table 3 Summary statistics of misreporting variables
Fig. 6

Employment length by delinquency rate. Notes. This graph shows the average delinquency rate by employment length. The empirical analysis is performed on the loan-level dataset of loans issued by LendingClub between June 2007 and September 2018. Both current and matured loans are included. The maximum employment length is associated with the highest delinquency rate

Factors affecting the degree of prudence

Thus far, the results indicate that the LC rating has slotted some borrowers into a better risk class than the external rating. However, this may lead to an excessive underestimation of borrower risk, thereby destroying lenders’ profits. To investigate this issue and the determinants that affect the degree of prudence of LendingClub, I use the prudence index explained in the last section as a proxy for screening quality. The generalized ordered logit model is estimated as follows (Williams 2006):

$$\ln\left(Y'_{i}\right)=\alpha_{1}+\beta_{1}\times\mathrm{Rounded\_Income}_{i}+\beta_{2}\times\mathrm{Rounded\_Amount}_{i}+\beta_{3}\times\mathrm{Length\_Title}_{i}+\beta_{4}\times\mathrm{Suspect\_emp}_{i}+\beta_{5}\times X_{i}+\varepsilon_{i}$$

The dependent variable is the degree of prudence of the LC rating grade, ranging from 1 to 3. Following the literature on borrower misrepresentation, the income roundness, the loan amount roundness, the number of words provided by borrowers in the title description, and the suspicion of inflating the working years are used as predictors. Xi is a vector of control variables, including loan information and borrower characteristics. The Gologit model estimates the odds of being beyond a certain category (highest prudence) versus being at or below that category (lowest prudence). The Brant test was used to evaluate the proportional odds (PO) assumption, which is violated for some covariates. Following Williams (2006), I use the partial proportional odds model, which holds constant the coefficients of covariates that meet the PO assumption and allows the other coefficients to vary across the categories of the response variable. Because the dependent variable has three categories, two equations are fitted. Some variables have a single constant odds ratio across the equations because the PO assumption is fulfilled, for instance, debt-to-income ratio, home mortgaged, length of employment, and annual income. In contrast, other variables have different coefficients for each of the prudence categories, and their effects vary across the levels of the response variable. Table 4 displays the results.
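
For reference, the partial proportional odds form of the generalized ordered logit (Williams 2006) with a three-category outcome can be written as two cumulative logit equations. The following is a sketch in generic notation: the symbols a_j, b, c_j, W_i, and V_i are introduced here for illustration and are not the paper's own, with W_i collecting the covariates that meet the PO assumption and V_i those that violate it.

$$P\left(Y_{i}>j\right)=\frac{\exp\left(a_{j}+W_{i}b+V_{i}c_{j}\right)}{1+\exp\left(a_{j}+W_{i}b+V_{i}c_{j}\right)},\quad j=1,2$$

Under this form, the coefficients in b yield a single odds ratio shared by both equations, whereas the coefficients in c_1 and c_2 can differ between the comparison of category 1 versus categories 2 and 3 and the comparison of categories 1 and 2 versus category 3.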

Table 4 Regressions results

There are two takeaways from this analysis. First, in all models, the screening activity does not improve when loans are not verified by the platform, revealing negligent behavior in assessing borrowers’ creditworthiness. This effect is robust in the neutral category (3 vs 1 and 2), decreasing by 40% the odds of being above a category (high prudence) versus being in that category. Second, all misreporting variables strongly predict the screening quality of the LC rating grade, suggesting that the risk associated with these characteristics is not entirely incorporated by the platform when listing loans. These predictors take different coefficients for each category of the response variable because the parallel-lines (proportional odds) assumption is violated. Variables with an odds ratio lower (higher) than 1 indicate that the LendingClub scoring models underestimate (overestimate) the borrower risk. I first focus on Model 1, in which only two misreporting variables are included. For instance, income rounded to the threshold of 5,000 is a negative predictor of the dependent variable, suggesting that a one-unit increase in the roundness of income decreases the odds of higher prudence by 8.5% and 10.2%.Footnote 16 Consistent with my view, the variable loan amount seems to decrease the odds of prudence, again suggesting an underestimation of the borrower risk. Conversely, the squared specification of loan amount indicates that the platform raises the quality standard, prompting substantial growth in loan volumes to avoid a collapse of the market. Except for the debt-to-income (DTI) ratio, which positively impacts the response variable, the other hard information variables, such as months since recent inquiries, are negative and significant predictors. This indicates that the Fintech platform focuses largely on the DTI in its screening activity, neglecting additional credit risk information.
In the second model, the previous misreporting variables are replaced by the two specifications of the roundness of the loan amount. The covariates remain strongly significant after controlling for other borrower information (e.g., home status and employment length). The effect of loan amount roundness varies across thresholds, with a decrease of 20.9% in the odds of being in the best category of prudence at the higher level of roundedness. The last model shows that the screening quality does not increase when some borrowers provide a longer description of the loan. However, borrowers who give a long loan purpose description are associated with high rates of default. This interpretation is consistent with the view that the platform’s screening is careless in distinguishing between good and bad loans. In contrast, the level of prudence is higher toward borrowers who report a length of employment at an extreme value, which does not confirm the hypothesis that the platform neglects borrowers who state a maximum period of work. The prescreening activity appears to be stricter toward borrowers who use loans for small business purposes. At the same time, there is little prudence toward the debt consolidation and credit card purposes, which represent the majority of the loans issued on the platform.
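
The percentage changes in odds quoted above follow from the usual reading of logit-type coefficients: the odds ratio is the exponential of the coefficient, and the percentage change in the odds is the odds ratio minus one, times 100. The sketch below illustrates the arithmetic only. The function names are hypothetical, and the odds ratio of 0.915 is an illustrative value chosen to match a decrease of 8.5%, not a figure taken from Table 4.

```python
import math


def odds_ratio(coefficient):
    # The odds ratio of a logit-type coefficient is its exponential
    return math.exp(coefficient)


def percent_change_in_odds(odds_ratio_value):
    # An odds ratio below 1 gives a negative change (lower odds of higher prudence)
    return (odds_ratio_value - 1.0) * 100.0


# Illustrative odds ratio of 0.915, i.e. a decrease of about 8.5% in the odds
change = percent_change_in_odds(0.915)
```

Read this way, an odds ratio below 1 corresponds to the platform underestimating borrower risk, in line with the interpretation given in the text.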

Robustness checks

To ensure the robustness of the regression results presented in the last section, I adopt a new classification of the FICO score based on a further criterion. The new assessment of FICO scores is based on the percentiles of the variable. Specifically, the first bin (FICO ≤ 667) takes all values below the 10th percentile; the second bin takes all values within the 10th and 25th percentiles (668–677); the third takes all values within the 25th and 50th percentiles (678–692); the fourth takes all values within the 50th and 75th percentiles (693–717); the fifth takes all values within the 75th and 90th percentiles (718–747); the sixth takes all values within the 90th and 95th percentiles (748–767); the last bin takes all values within the 95th and the 99th percentiles (FICO > 767). Table 5 shows the regression results obtained by using the prudence grade constructed on the new assessment of FICO scores as the response variable, with the same predictors as those shown in Table 4. The regression results remain almost unchanged, confirming the robustness of the previous results.
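
The percentile-based bins above make it possible to sketch how a prudence grade could be computed for a single loan. The sketch below is illustrative only: the function and constant names are hypothetical, and it assumes that the sign of the difference between the LC score and the FICO bin determines the level, which simplifies the thresholds described in the text.

```python
from bisect import bisect_left

# Upper bounds of the first six FICO bins, taken from the percentile cut-offs above
FICO_UPPER_BOUNDS = [667, 677, 692, 717, 747, 767]

# LC grade on a 1-7 scale, where G is the highest risk and A the lowest
GRADE_TO_SCORE = {"G": 1, "F": 2, "E": 3, "D": 4, "C": 5, "B": 6, "A": 7}


def fico_bin(fico):
    # Returns 1 for FICO <= 667, 2 for 668-677, ..., 7 for FICO > 767
    return bisect_left(FICO_UPPER_BOUNDS, fico) + 1


def prudence_grade(lc_grade, fico):
    # Assumption: the sign of the difference decides the prudence level
    difference = GRADE_TO_SCORE[lc_grade] - fico_bin(fico)
    if difference > 0:
        return 1  # LC places the borrower in a safer class than FICO (low prudence)
    if difference == 0:
        return 2  # Both systems agree (same prudence)
    return 3  # LC is stricter than FICO (high prudence)


# Hypothetical loan: grade A with a FICO score of 670 (second bin)
example = prudence_grade("A", 670)
```

In this hypothetical case, the loan receives the low-prudence level because LC assigns its best grade to a borrower in the second-lowest FICO bin.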

Table 5 Regression results with the dependent variable Prudence grade

Calculation of recovery rate

In the previous section, I investigated the platform’s misconduct through the prudence index, confirming the Fintech platform’s inability to detect some misreporting borrowers. This section explores whether the lack of a verification process for the information reported by borrowers (e.g., annual income and employment length) harms the lenders’ collection performance.

To date, the LGD and RR have been less studied than the probability of default in the P2P lending market. RR represents the proportion of money that lenders can successfully recover once the borrower has defaulted on the loan, net of the administration fees charged during the collection period. In contrast, LGD is defined as the proportion of money investors fail to recover, given that the borrower has already defaulted. The equations of the RR and LGD are reported below:

$$\mathrm{Recovery\ Rate}=\frac{\sum \mathrm{Recoveries}-\sum \mathrm{Collection\ recovery\ fee}}{\mathrm{Exposure\ at\ Default}}$$

$$\mathrm{Loss\ Given\ Default}=1-\frac{\sum \mathrm{Recoveries}-\sum \mathrm{Collection\ recovery\ fee}}{\mathrm{Exposure\ at\ Default}}$$

Typically, the RR and LGD lie in the interval [0,1], with high peaks at the boundary values 0 and 1. The RR could be less than 0 if recoveries are lower than the administration fee and greater than 1 if net recoveries exceed the exposure at default. The denominator is defined as the outstanding loan balance when the loan defaults. The RR of all defaulted loans issued on LendingClub is estimated with Eq. (2), and the LGD with Eq. (3). Variables are winsorized at the 1% and 99% levels to mitigate the influence of outliers. The descriptive statistics of the RR and the LGD are listed in Table 6.
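
As a worked illustration of Eqs. (2) and (3), the sketch below computes the RR and LGD for a single defaulted loan. The function names and the loan figures are hypothetical, and winsorization is not shown.

```python
def recovery_rate(recoveries, collection_fees, exposure_at_default):
    # Eq. (2): recoveries net of collection fees over the balance at default
    return (recoveries - collection_fees) / exposure_at_default


def loss_given_default(recoveries, collection_fees, exposure_at_default):
    # Eq. (3): the share of the exposure that is not recovered
    return 1.0 - recovery_rate(recoveries, collection_fees, exposure_at_default)


# Hypothetical loan: 1,000 outstanding at default, 150 recovered, 27 in fees
rr = recovery_rate(150.0, 27.0, 1000.0)
lgd = loss_given_default(150.0, 27.0, 1000.0)
```

For this hypothetical loan, the RR is 12.3% and the LGD is 87.7%. If the fees exceeded the recoveries, the RR would turn negative, which is the boundary case mentioned in the text.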

Table 6 Descriptive statistics of RR and LGD

As shown above, the average values of the RR and LGD are 9.8% and 90.2%, respectively, indicating a sizeable total default loss and insufficient collection. This suggests that LendingClub originates loans with extreme credit risk, consistent with the low RR values in the overall unsecured market. To test the normality assumption for the empirical distribution of the RR, the density function of the RR of LendingClub’s defaulted loans is estimated using the kernel method, resulting in the distribution displayed in Fig. 7. The RR clearly does not follow the normal distribution, with a high spike at the boundary value 0 and several peaks at 0.15. This is further supported by the Kolmogorov–Smirnov test, which I applied as a robustness check. The analysis of the RR and LGD of LendingClub’s defaulted loans shows that the priority given to protecting lenders against credit risk is relatively low, suggesting that the Fintech platform has made only feeble efforts in debt collection.

Fig. 7

Kernel density of recovery rates in the sample. Notes. This graph shows the density distribution of the recovery rates on defaulted loans in the sample. The stack of 0s shows the frequency of RR = 0, resulting in a non-unimodal distribution

Moreover, the mean recovery rate is heterogeneous across geographic zones, as presented in Fig. 8. The borrowers’ locations are based on the first three digits of the ZIP code and are captured by ten dummy variables following the zone classification of the United States. The lowest recovery rates are found between zone 0 and zone 6, which include, for instance, Connecticut, Massachusetts, Illinois, and other states.

Fig. 8

Mean recovery rate by zone. Notes. This graph shows the average recovery rate of the loans issued on the LendingClub platform in each zone. The US is classified into 10 zones based on the first three digits of the borrower’s ZIP code

Modeling the RR is challenging because it does not follow a normal distribution. Recent statistical work has proposed two-stage models based on mixed continuous-discrete distributions.Footnote 17 Beta regression, zero–one inflated beta regression, beta mixture models with logistic regression, and fractional regression have been applied, as shown in Table 7. The models’ predictive performance has been assessed through two accuracy measures, namely, root mean square error (RMSE) and mean absolute error (MAE).Footnote 18 All models have been trained on the same dataset, avoiding potential bias due to different data. In applying the standard beta regression, I have adopted the transformation of the response variable proposed by Smithson and Verkuilen (2006) to include the values 0 and 1. In terms of predictive power, the zero–one inflated beta regression model seems to perform better than the others. Based on these findings, the RR is modeled with the zero–one inflated beta regression.
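
The transformation of Smithson and Verkuilen (2006) and the two accuracy measures can be sketched as follows. The transformation maps a response y in [0, 1] to (y(N − 1) + 0.5)/N, where N is the sample size, so that exact zeros and ones fall strictly inside the unit interval. The function names are hypothetical, and the RMSE and MAE follow their standard definitions, which are assumed to match those used in the paper.

```python
import math


def squeeze_unit_interval(values):
    # Smithson and Verkuilen (2006): y' = (y * (N - 1) + 0.5) / N
    n = len(values)
    return [(y * (n - 1) + 0.5) / n for y in values]


def rmse(actual, predicted):
    # Root mean square error between observed and predicted recovery rates
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def mae(actual, predicted):
    # Mean absolute error between observed and predicted recovery rates
    absolute_errors = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute_errors) / len(absolute_errors)


# Hypothetical recovery rates including both boundary values
squeezed = squeeze_unit_interval([0.0, 0.5, 1.0])
```

With three observations, the boundary values 0 and 1 are mapped to 1/6 and 5/6, while 0.5 is left unchanged.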

Table 7 Models’ comparison

Zero-inflated beta regression of recovery rate

In the last sections, I focused on the density function of RRs for LendingClub loans. What are the determinants of the low RRs? Are they the same within each risk class? This section presents RR modeling on non-performing loans through a mixed continuous-discrete model adopted in the literature to estimate LGD in the unsecured market. I perform the zero–one inflated beta regression (ZOIB)Footnote 19 with two components, which are estimated simultaneously: (1) a logistic regression that models the predicted probability that a loan has no recovery (RR = 0); and (2) a beta regression that models the level of the RR between 0 and 1 (0 < RR < 1). Following Cragg (1971) and Cook et al. (2008), the logit link is used to model pi as a function of explanatory variables, as defined in the following equation:

$$\mathrm{Logit}\left(p_{i}\right)=\alpha+\beta'\,\mathrm{NotVerified}_{i}+\delta Z_{i}$$

The response variable is the RR on non-performing loans in the sample. Not Verified is an indicator variable that reflects the verification status of the information submitted by applicants. The vector Zi includes a set of controls, for instance, loan characteristics (interest rate, loan amount, and maturity), borrowers’ solvency information (debt-to-income ratio, months since the last delinquency, number of inquiries, number of bank accounts, revolving utilization rate, mortgage accounts) and self-reported variables (annual income, borrower working years, homeownership status, loan purpose, state of borrower). The verification process is used as a second proxy for analyzing the platform’s misconduct, as a lower frequency of borrower verification could hide an attempt to boost loan volume. I regress the RRs of defaulted loans on the platform’s verification process, and the results are shown in Table 8.
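
To make the two-component structure concrete, the sketch below evaluates a zero-inflated beta density for a single observation. It is an illustration under stated assumptions rather than the paper's estimation code: the function names are hypothetical, the beta part uses the common mean-precision parameterization (mean mu, precision phi), which the text does not specify, and only the mass at RR = 0 is modeled, following the description of the two components above.

```python
import math


def logistic(x):
    # Inverse of the logit link used for the probability of RR = 0
    return 1.0 / (1.0 + math.exp(-x))


def beta_pdf(y, mu, phi):
    # Beta density with mean mu and precision phi (assumed parameterization)
    a, b = mu * phi, (1.0 - mu) * phi
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1.0) * math.log(y) + (b - 1.0) * math.log(1.0 - y))


def zero_inflated_beta_density(y, p_zero, mu, phi):
    # Point mass at zero plus a beta density on the open unit interval
    if y == 0.0:
        return p_zero
    if 0.0 < y < 1.0:
        return (1.0 - p_zero) * beta_pdf(y, mu, phi)
    raise ValueError("y must lie in [0, 1)")


def expected_recovery_rate(p_zero, mu):
    # The overall mean combines both components
    return (1.0 - p_zero) * mu


# Hypothetical loan: 40% chance of no recovery, mean RR of 0.25 otherwise
mean_rr = expected_recovery_rate(0.4, 0.25)
```

Under these assumptions, a covariate such as Not Verified can lower the expected RR through either channel: by raising the probability of no recovery in the logit component or by lowering the mean in the beta component.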

Table 8 Regression results recovery rate in the overall sample

The dependent variable is the RR on defaulted loans, and it is the same in all models. The results are presented for both the beta regression (for proportional values, 0 < RR < 1) and the logistic model (whether or not RR = 0). The first model investigates the relationship between RR and loan contract information and mainly captures the effect of the variable Not Verified on the RRs. The first result of this model is that incompletely verified borrower information results in lower RRs. Specifically, the coefficients for Not Verified are negative in the beta models, reflecting a decrease in the proportional value of RRs, and positive in the zero-inflated (logit) component, increasing the predicted probability that investors recover nothing after the borrowers are charged off. In terms of practical significance, a change of the variable from 0 to 1 decreases the response variable by 0.7 ppt, as shown in Table 9.

Table 9 Average marginal effects

Regarding loan contract information, the signaling effect of the interest rate on borrower riskiness seems to fail. The negative coefficients of the interest rate in the beta specifications suggest that borrowers with higher interest rates do not present an RR equal to zero. This result sheds light on the loan pricing mechanism applied by the platform and on the need to improve it. In the second specification, some control variables related to borrowers’ indebtedness are added, and the effect of Not Verified on the RRs remains unchanged. Borrowers whose account balances are near the upper limit and whose last delinquency occurred recently have a lower RR, as indicated by the positive coefficients in the zero-inflated specifications. However, borrowers with repeated bank relationships, measured in terms of total bank accounts and revolving utilization, significantly impact RR in the zero-inflated component. This might suggest that the tie between borrowers and banks enhances borrowers’ accountability, as they seek to avoid losing reputation and to raise future funding from banks. Finally, in the third model, the variables related to the borrower’s total assets provided by the credit bureau are dropped, and self-reported information is added. Concerning these variables, the regression results indicate that RR is significantly negatively correlated with the borrowers’ indebtedness ratio (e.g., loan amount to annual income). This finding implies that an excessive increase in debt beyond the safety threshold can absorb the majority of the borrower’s income, undermining the borrower’s solvency. Homeownership has a weakly significant impact on RR for borrowers with a mortgage, unlike the borrower’s annual income, which has a significant positive influence on RR.

Borrowers who use the funding for small business development have lower RRs than those who apply for loans for credit card and debt consolidation purposes. The negative and significant relationship between the year dummy and RR confirms a decrease in RRs over the last years. This might be a consequence of the fraud detected by the SEC at the beginning of 2016. The suspicion that the platform was a dishonest broker could have decreased consumers’ accountability and encouraged fraudulent behavior. The quality of the three specifications has been tested through the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC). The first model achieves the best goodness of fit on both criteria. Consequently, loan contract information seems to be the best predictor of RRs in this empirical analysis.
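
For readers less familiar with the two criteria, the sketch below shows how the AIC and BIC are computed from a fitted model's log-likelihood, using their standard definitions: AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L, where k is the number of estimated parameters and n the number of observations. The function names and the input values are hypothetical and do not come from Table 8.

```python
import math


def aic(log_likelihood, n_params):
    # Akaike Information Criterion: lower values indicate a better trade-off
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    # Bayesian Information Criterion: penalizes parameters more as n grows
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical model: log-likelihood of -100 with 5 parameters and 1,000 loans
example_aic = aic(-100.0, 5)
example_bic = bic(-100.0, 5, 1000)
```

The specification with the lowest AIC and BIC is preferred, which is the sense in which the first model achieves the best fit.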

Regression results of the recovery rate within risk classes

In the previous section, the zero–one inflated beta regression was applied to all risk classes based on the LC grade. This section investigates the determinants of the RRs within each loan risk class separately. To this end, the risk classes are grouped into a low-risk class (loans with grade A or B), a medium-risk class (loans with grade C or D), and a high-risk class (loans with grade E, F, or G). The zero–one inflated regressions for the three risk classes allow a comparison between the regression results for a given risk class and those for the whole dataset. Regression results within each risk class are displayed in Table 10.

Table 10 Regression results within each risk-class

The results in Table 10 provide additional support for the results in the last section. The indicator variable related to the verification of borrower data decreases the average proportion of RR within each risk class, confirming the previous findings for the whole dataset. Regarding borrowers’ banking variables, the regression results for each class separately and for the full dataset differ only slightly. For the loan contract information, the variable loan amount is a significant negative predictor in the zero-inflation models in all risk classes, suggesting that an increase in the amount decreases the predicted probability of RR = 0, with a significant effect in the low-risk class. The coefficients on the term of the loan, months since the last delinquency, and the loan amount to annual income ratio are negative and highly significant in all risk classes. The variable interest rate is a negative predictor in the low-risk and high-risk classes, but it is not significant in the medium-risk class. The interest rate should be an essential predictor of the borrower’s riskiness. However, it yields a correct prediction only in the low-risk category, reducing the RR on average by 0.002 points. Concerning self-reported information, the borrower’s annual income is not significant in the low-risk and high-risk classes, while it is significant in the medium-risk class. The length of employment positively impacts the RR in the medium- and high-risk categories but not in the low-risk one. The strong negative relationship between small business purpose and RR is confirmed only in the medium-risk class. Overall, the robustness test demonstrates that the main predictor of the RRs is the verification status of the loans. Strengthening the verification process might benefit both lenders, by mitigating the harmful effects of the LGD, and the platform itself, in terms of reputation.
Moreover, these results underline the weakness of the risk management framework adopted by LC, as the low RR could reflect the careless screening of borrowers (Table 11).

Table 11 Average marginal effects

Discussion and conclusion

P2P Fintech platforms represent an essential source of alternative funding, thereby fostering credit democratization. Lending platforms perform functions similar to those of traditional intermediaries, such as loan evaluation, pricing, and screening (Balyuk and Davydenko 2019). However, lending platforms are in a challenging position because of the trade-off between improving borrower screening and maximizing loan volume. Their incentives could lead them to boost loan originations by decreasing credit quality. This paper presents some significant findings regarding how players’ incentives shape their behavior, with theoretical and practical implications, as discussed below. First, this study contributes to the literature on financial misconduct and P2P lending by exploring the impact of platforms’ incentives on the assessment of borrower riskiness. Although a growing body of research provides valuable discussions of the determinants of default and funding success in the crowdfunding market (e.g., Lee and Lee 2012; Morse 2015; Serrano-Cinca et al. 2015), how platforms’ incentives affect their behavior in terms of borrower screening has not yet been investigated. My regression results show that the degree of prudence taken by the lending platform does not improve when some borrowers report misleading information. However, I find that the platform raises its quality standard only when borrowers present misreporting characteristics that are easier to identify, such as a self-reported length of employment at an extreme level. The screening quality of the LC marketplace seems to have decreased over the past years.

On the one hand, this could be ascribable to challenging times in the sector and the episodes of fraud detection in general. On the other hand, the financial restatements imposed by the SEC and DOJ may have prompted LendingClub to adopt more aggressive loan underwriting, thereby resulting in a decrease in prudence. I also show that the borrowers’ rating does not increase when the platform does not verify some loans. According to the signaling theory in the crowdfunding market (Ahlers et al. 2015), Fintech platforms that adopt due diligence at loan origination, such as strengthening the verification of information self-reported by borrowers, could signal their trustworthiness in the market. Second, this work contributes to the P2P lending literature by exploring the relationship between the recovery rate and the lack of verification of the information reported by borrowers. Few studies have analyzed the key drivers of the recovery rate in the P2P lending market (Pursiainen 2020; Zhou et al. 2018). The regression results suggest that incompletely verified borrower information negatively affects the recovery rate, harming the lenders’ collection performance. Although this paper focuses on only one P2P lending platform, and as such its findings may not be generalizable, it provides some preliminary practical insights for lenders, consumers, and policymakers to guarantee the market’s survival.

This research has important implications for lenders in the P2P lending market. The results suggest that the lending platform is unable to detect some misreporting borrowers, thus harming the lenders’ collection performance. My results are consistent with the theoretical insights from behavioral finance and psychology (Chao 2021; Jansen and Pollmann 2001), which state that misreporting borrowers have a higher tendency to communicate round values about their assets. Thus, these results can help lenders in marketplace lending make more informed investment decisions by incorporating an index of borrowers’ misreporting (e.g., the rounding of income, the rounding of amount) into their decision-making process. This research also encourages lenders to be mindful of default rates and the LGD in assessing the credit risk of loans. In addition, it provides knowledge regarding the dynamics of the crowdfunding market, thereby enhancing understanding of the platform’s role in the governance of lending marketplaces. Policymakers should pay attention to this market, as lending platforms act without skin in the game, do not take deposits, and do not perform maturity transformation. This makes them vulnerable to decreased standards in loan evaluations. Therefore, policymakers should adopt due diligence to ensure that lending platforms consistently meet the quality standard of the prescreening activity.

The limitations of this study can be overcome in further research. Future works might corroborate the findings using a new sample to increase external validity. For instance, the hypothesis developed in this work can be replicated on other crowdfunding platforms by comparing the different crowdfunding mechanisms (e.g., reward-based, donation-based, lending-based, and equity-based). These further studies can help in understanding which type of crowdfunding presents a higher likelihood of misconduct and whether the verification process works better depending on the model implemented. A second direction for future studies could include other factors not accounted for in my study, such as the macroeconomic and cultural context, to explore whether the main results continue to hold in a different setting. Third, future research should investigate whether a cut-off point exists beyond which maximizing platforms’ incentives harms lenders. Finally, the socio-economic crisis due to the COVID-19 pandemic has been reshaping all economic activities. Researchers should investigate the crowdfunding market’s resilience in such an uncertain time and whether the quality standard in the screening of loans has decreased to restore the volume of loans.

Availability of data and materials

Data used in this paper were collected from LendingClub.


  1. The term “Fintech marketplace” refers to the involvement of institutional investors within the P2P lending market in funding loans. For a detailed definition see FSB, 2017.

  2. In December 2015, the Chinese Government accused Ezubao, an online lending platform, revealing that approximately 95% of its investment projects were fake and fabricated with information bought from other companies. The executive company confirmed the fraudulent activity and 21 Ezubao officials were arrested (Albercht et al. 2017).

  3. Further, the investigation proved that a fund traced to the CEO who resigned had bought 115 million worth of LendingClub loans without previously disclosing the conflict of interest (DOJ settlements 2016).

  4. LendingClub is already implicated in class-action lawsuits in California, where the company has been accused of “making materially false and misleading statements in the registration and prospectus issued with the IPO,” and in New York, where people received usury loans through the platform (Business Insiders 2016).

  5. The boundaries value is modelled through the binary logistic regression (e.g., the predicted probability of RR being 0 versus not 0) and the continuous part (0–1) from the beta distribution.

  6. In general, they argue that managers’ compensation ties to stock options provide them with several incentives in the engagement of aggressive accounting policies, which, in turn, results in improper conduct.

    Research on this topic is mixed, and there is still no predominant picture. For example, Erickson, Hanlon, and Maydew (2006) measured equity incentives of firms accused of fraud by the SEC, but they did not find a link between executive equity incentives and fraud. Findings of Armstrong et al. (2010) suggest that financial misstatement affects the risk-reward trade-off, involving positive and negative effects, and equity incentives make it more likely when managers are less averse to equity risk.

    Burns and Kedia (2006) find that other components of compensation (i.e., salary plus bonus, equity, restricted stock) do not affect the propensity to misreport, as they do not introduce convexity in CEO wealth.

  7. For instance, more than 70% of financial misconduct occurs in founder’s firms due to their overconfidence and hubris (Amiram et al. 2017).

  8. Firms accused of financial misreporting may be subject to direct costs as represented by monetary fines and other penalties that have the potential to reduce approximately 38% of the firms’ value (Karpoff et al. 2008).

  9. Other studies have investigated the role of text descriptions for predicting loan default through text mining analysis (see Herzenstein et al. 2011; Gao and Lin 2012; Nowak et al. 2018) and the likelihood of discrimination against black or female borrowers (see Pope and Sydnor 2011; Duarte et al. 2012; Ravina 2012; Loureiro and Gonzalez 2015; Dorfleitner et al. 2016).

  10. Using a dataset from LendingClub, they state that the more significant determinants of default are the credit grade assigned by LC and credit line utilization.

    For instance, Polena and Regner (2018), using a LendingClub dataset from January 2009 to December 2012, found that annual income, credit grade, inquiries in the past six months, loan purpose, credit card, and small business are significant determinants of default within each risk class.

  11. Large strands of literature focus on the moderator effects of financial misconduct (Li et al. 2017; Nguyen et al. 2015) and explore the role of skin in the game in discouraging any wrongdoing (Gorton and Pennacchi 1995; Holmstrom and Tirole 1997).

  12. Estrada and Zamora (2016) state that a lower screening cost and a higher benefit from projects act as incentives to screen carefully.

  13. In early 2009, LendingClub started a new program called “Test proposal.” Under the first proposal (TP1), the platform approved loan applications from borrowers who did not meet certain requirements. The Credit Department had initially declined these loans, but they were approved under the TP1 program. In February 2009, LendingClub began the TP2 program, which, as in TP1, approved loans that did not satisfy the constraints. The Fintech platform aimed to boost the loan origination volume. The Credit Department was aware of the underperformance of these loans, and the programs resulted in sanctions by the Department of Justice and the SEC (DOJ Settlement Agreements 2016).

  14. It represents a performance measurement of a binary classifier by plotting the true positive rate (TPR) against the false positive rate (FPR). The AUC (area under the curve) measures the accuracy of the screening system, ranging from 0 (the worst tool) to 1 (a perfect tool). Iyer et al. (2016) state that an AUC of 0.6 or greater is considered valuable in large information-asymmetry environments, while an AUC of 0.7 or greater is a desirable goal in rich-information contexts.

  15. Specifically, the variable takes value 1 if the FICO score is lower than 660, and values 2, 3, 4, 5, 6 and 7 if it is 660 to 679, 680 to 699, 700 to 739, 740 to 759, 760 to 779, and above 780, respectively.

  16. The largest odds ratio is identified in the neutral category that compares groups 3 vs 1 and 2. However, the effect is smaller when the income is rounded at a higher threshold.

  17. This family of distributions, introduced by Ospina and Ferrari (2012), allows us to model data that assume values in [0, 1), (0, 1] or [0, 1].

  18. Lower values of RMSE and MAE indicate a better fit, suggesting that the model can predict the response variable accurately.

  19. Zero-adjusted beta regression is more appropriate for modeling dependent variables containing a large number of zeros.
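Footnotes 5, 17 and 19 describe the recovery rate as a mixture of a point mass at zero and a beta-distributed continuous part. A minimal sketch of the corresponding log-likelihood follows, in Python using only the standard library. It assumes a mean/precision parameterization of the beta distribution (`mu`, `phi`) and constant parameters; in a regression setting, the zero probability would come from the logistic regression and the beta mean from its own regression equation. All function names are illustrative and are not taken from the paper.

```python
import math


def beta_logpdf(y, mu, phi):
    """Log-density of a beta distribution on (0, 1).

    Assumed parameterization: mean mu in (0, 1) and precision phi > 0,
    i.e. shape parameters a = mu * phi and b = (1 - mu) * phi.
    """
    a, b = mu * phi, (1.0 - mu) * phi
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return log_norm + (a - 1.0) * math.log(y) + (b - 1.0) * math.log(1.0 - y)


def zero_adjusted_beta_loglik(recovery_rates, p_zero, mu, phi):
    """Log-likelihood of recovery rates in [0, 1) under a zero-adjusted beta model.

    p_zero is the probability that RR equals exactly 0 (the discrete part);
    the remaining mass 1 - p_zero is spread over (0, 1) by the beta density.
    """
    if not (0.0 < p_zero < 1.0 and 0.0 < mu < 1.0 and phi > 0.0):
        raise ValueError("need 0 < p_zero < 1, 0 < mu < 1 and phi > 0")
    total = 0.0
    for rr in recovery_rates:
        if rr == 0.0:
            total += math.log(p_zero)
        elif 0.0 < rr < 1.0:
            total += math.log1p(-p_zero) + beta_logpdf(rr, mu, phi)
        else:
            raise ValueError("recovery rates must lie in [0, 1)")
    return total


def expected_recovery(p_zero, mu):
    """Unconditional mean of RR: zero with probability p_zero, mean mu otherwise."""
    return (1.0 - p_zero) * mu
```

With `mu = 0.5` and `phi = 2.0` the beta part reduces to a uniform density on (0, 1), which gives a quick sanity check of the implementation.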

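The AUC described in footnote 14 can be computed without drawing the ROC curve, because the area under the curve equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The sketch below uses this pairwise formulation, with ties counted as one half. The function name `roc_auc` and the 0/1 label coding are assumptions made for illustration only.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 outcomes (1 = positive class, e.g. default).
    scores: iterable of classifier scores, higher meaning "more likely positive".
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive correctly ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))
```

The double loop is quadratic in the sample size, which is fine for illustration; for large loan books a rank-based implementation would be preferable.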

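The ordinal FICO coding in footnote 15 can be expressed as a lookup over the lower bounds of each class. This is a hedged sketch: the footnote assigns value 7 to scores "above 780" and leaves a score of exactly 780 unassigned, so placing 780 in the top class is an assumption made here. The names `FICO_LOWER_BOUNDS` and `fico_class` are hypothetical.

```python
from bisect import bisect_right

# Lower bounds of classes 2..7; scores below 660 fall into class 1.
# Assumption: a score of exactly 780 is placed in class 7.
FICO_LOWER_BOUNDS = [660, 680, 700, 740, 760, 780]


def fico_class(score):
    """Map a FICO score to the ordinal class 1..7 described in footnote 15."""
    return bisect_right(FICO_LOWER_BOUNDS, score) + 1
```

For example, `fico_class(675)` falls in the 660 to 679 band and `fico_class(720)` in the 700 to 739 band.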
  • Agarwal S, Ben-David I (2014) Do loan officers' incentives lead to lax lending standards? National Bureau of Economic Research

  • Agrawal A, Chadha S (2005) Corporate governance and accounting scandals. J Law Econ 48(2):371–406

  • Ahlers GK, Cumming D, Günther C, Schweizer D (2015) Signaling in equity crowdfunding. Entrepreneurship Theory Pract 39(4):955–980

  • Albrecht C et al (2017) Ezubao: a Chinese Ponzi scheme with a twist. J Financ Crime

  • Altunbaş Y, Thornton J, Uymaz Y (2018) CEO tenure and corporate misconduct: evidence from US banks. Financ Res Lett 26:1–8

  • Amiram D, Beaver WH, Landsman WR, Zhao J (2017) The effects of credit default swap trading on information asymmetry in syndicated loans. J Financ Econ 126(2):364–382

  • Armstrong C, Jagolinzer AD, Larcker DF (2010) Performance-based incentives for internal monitors. Rock Center for Corporate Governance at Stanford University working paper series, (76)

  • Balyuk T, Davydenko SA (2019) Reintermediation in FinTech: evidence from online lending

  • Berg T, Burg V, Gombović A, Puri M (2018) On the rise of fintechs–credit scoring using digital footprints (No. w24551). National Bureau of Economic Research

  • Bergstresser D, Philippon T (2006) CEO incentives and earnings management. J Financ Econ 80(3):511–529

  • Bertsch C, Rosenvinge CJ (2019) FinTech credit: Online lending platforms in Sweden and beyond. Sveriges Riksbank Econ Rev (sweden) 2:42–70

  • Burns N, Kedia S (2006) The impact of performance-based compensation on misreporting. J Financ Econ 79(1):35–67

  • Calabrese R (2014) Predicting bank loan recovery rates with a mixed continuous-discrete model. Appl Stoch Model Bus Ind 30(2):99–114

  • Carmichael D (2014) Modeling default for Peer-to-Peer Loans (November 21, 2014). Available at SSRN:

  • Chami R, Fullenkamp C, Sharma S (2010) A framework for financial market development. J Econ Policy Reform 13(2):107–135

  • Chao R (2021) Optimization of China’s financial advertising regulation system: based on behavioral finance and EU experience. J Shanghai University Finance Econ 23(02):136–152

  • Chawla G, Forest LR Jr, Aguais SD (2016) Point-in-time loss-given default rates and exposures at default models for IFRS 9/CECL and stress testing. J Risk Manag Financ Inst 9(3):249–263

  • Cheng Q, Warfield TD (2005) Equity incentives and earnings management. Account Rev 80(2):441–476

  • Cook DO, Kieschnick R, McCullough BD (2008) Regression analysis of proportions in finance with self selection. J Empirical Finance 15(5):860–867

  • Cragg JG (1971) Some statistical models for limited dependent variables with application to the demand for durable goods. Econometrica 39(5):829

  • Cumming DJ, Johan SA, Zhang Y (2019) The role of due diligence in crowdfunding platforms. J Bank Finance 108:105661

  • Daley B, Green B, Vanasco V (2020) Securitization, ratings, and credit supply. J Financ 75(2):1037–1082

  • Dechow PM, Hutton AP, Sloan RG (1996) Economic consequences of accounting for stock-based compensation. J Account Res 34:1–20

  • Degeorge F, Patel J, Zeckhauser R (1999) Earnings management to exceed thresholds. J Bus 72(1):1–33

  • Dorfleitner G, Priberny C, Schuster S, Stoiber J, Weber M, de Castro I, Kammler J (2016) Description-text related soft information in peer-to-peer lending–Evidence from two leading European platforms. J Bank Finance 64:169–187

  • Duarte J, Siegel S, Young L (2012) Trust and credit: the role of appearance in peer-to-peer lending. Rev Financ Stud 25(8):2455–2484

  • Efendi J, Srivastava A, Swanson EP (2007) Why do corporate managers misstate financial statements? The role of option compensation and other factors. J Financ Econ 85(3):667–708

  • Eid N, Maltby J, Talavera O (2016) Income rounding and loan performance in the peer-to-peer market. Available at SSRN 2848372

  • Emekter R, Tu Y, Jirasakuldech B, Lu M (2015) Evaluating credit risk and loan performance in online Peer-to-Peer (P2P) lending. Appl Econ 47(1):54–70

  • Erickson M, Hanlon M, Maydew EL (2006) Is there a link between executive equity incentives and accounting fraud? J Account Res 44(1):113–143

  • Estrada D, Zamora P (2016) P2P lending and screening incentives

  • Everett CR (2015) Group membership, relationship banking and loan default risk: the case of online social lending. Bank Finance Rev 7(2)

  • Fraser S, Bhaumik SK, Wright M (2015) What do we know about entrepreneurial finance and its relationship with growth? Int Small Bus J 33(1):70–88

  • Freedman S, Jin GZ (2017) The information value of online social networks: lessons from peer-to-peer lending. Int J Ind Organ 51:185–222

  • Gao Q, Lin M (2013) Linguistic features and peer-to-peer loan quality: a machine learning approach. Available at SSRN 2446114

  • Gao Y, Sun J, Zhou Q (2017) Forward looking vs backward looking: an empirical study on the. China Finance (7/2)

  • Garmaise MJ (2015) Borrower misreporting and loan performance. J Financ 70(1):449–484

  • Ge R, Feng J, Gu B, Zhang P (2017) Predicting and deterring default with social media information in peer-to-peer lending. J Manag Inf Syst 34(2):401–424

  • Giannetti M, Wang TY (2016) Corporate scandals and household stock market participation. J Financ 71(6):2591–2636

  • Gorton GB, Pennacchi GG (1995) Banks and loan sales marketing nonmarketable assets. J Monet Econ 35(3):389–411

  • Gourieroux C, Lu Y (2019) Least impulse response estimator for stress test exercises. J Bank Finance 103:62–77

  • Griffin JM, Maturana G (2015) Notice of withdrawal: ‘Who facilitated misreporting in securitised loans?’ J Finance 70(6):2897–2898

  • Hertzberg A, Liberman A, Paravisini D (2018) Screening on loan terms: evidence from maturity choice in consumer credit. Rev Financ Stud 31(9):3532–3567

  • Herzenstein M, Dholakia UM, Andrews RL (2011) Strategic herding behavior in peer-to-peer loan auctions. J Interact Mark 25(1):27–36

  • Hildebrand T, Puri M, Rocholl J (2017) Adverse incentives in crowdfunding. Manag Sci 63(3):587–608

  • Holmstrom B, Tirole J (1997) Financial intermediation, loanable funds, and the real sector. Q J Econ 112:663–691

  • Huang J, Sena V, Li J, Ozdemir S (2021) Message framing in P2P lending relationships. J Bus Res 122:761–773

  • Iyer R, Khwaja AI, Luttmer EF, Shue K (2016) Screening peers softly: Inferring the quality of small borrowers. Manag Sci 62(6):1554–1577

  • Jagtiani J, Lemieux C (2018) The roles of alternative data and machine learning in fintech lending: evidence from the LendingClub consumer platform

  • Jansen CJ, Pollmann MM (2001) On round numbers: pragmatic aspects of numerical expressions. J Quant Linguist 8(3):187–201

  • Jensen M, Meckling W (1976) Theory of the firm: managerial behavior, agency costs and ownership structure. J Financ Econ 3:3

  • Jiang W, Nelson AA, Vytlacil E (2014) Liar’s loan? Effects of origination channel and information falsification on mortgage delinquency. Rev Econ Stat 96(1):1–18

  • Karpoff J, Lott J (1993) The reputational penalty firms bear from committing criminal fraud. J Law Econ 36:757–802

  • Karpoff J, Lee DS, Martin GS (2008) The consequences to managers for cooking the books. J Financ Econ 88:193–215

  • Keys BJ, Mukherjee T, Seru A, Vig V (2010) Did securitization lead to lax screening? Evidence from subprime loans. Q J Econ 125(1):307–362

  • Khanna V, Kim EH, Lu Y (2015) CEO connectedness and corporate fraud. J Financ 70(3):1203–1252

  • Kou G, Xu Y, Peng Y, Shen F, Chen Y, Chang K, Kou S (2021) Bankruptcy prediction for SMEs using transactional data and two-stage multiobjective feature selection. Decis Support Syst 140:113429

  • Lee E, Lee B (2012) Herding behavior in online P2P lending: an empirical investigation. Electron Commer Res Appl 11(5):495–503

  • Li C, Li J, Liu M, Wang Y, Wu Z (2017) Anti-misconduct policies, corporate governance and capital market responses: International evidence. J Int Finan Mark Inst Money 48:47–60

  • Lin TC, Pursiainen V (2018) Fund what you trust? social capital and moral hazard in crowdfunding. Social Capital and Moral Hazard in Crowdfunding (July 31, 2018)

  • Lin M, Prabhala NR, Viswanathan S (2013) Judging borrowers by the company they keep: Friendship networks and information asymmetry in online peer-to-peer lending. Manag Sci 59(1):17–35

  • Loureiro YK, Gonzalez L (2015) Competition against common sense: insights on peer-to-peer lending as a tool to allay financial exclusion. Int J Bank Mark 33(5):605–623

  • Lu Y, Gu B, Ye Q, Sheng Z (2012) Social influence and defaults in peer-to-peer lending networks

  • Lucas RE (1976) Econometric policy evaluation: a critique. Carn-Roch Conf Ser Public Policy 1:19

  • Mason W, Watts DJ (2009) Financial incentives and the “performance of crowds”. In: Proceedings of the ACM SIGKDD workshop on human computation, pp 77–85

  • Morse A (2015) Peer-to-peer crowdfunding: information and the potential for disruption in consumer lending. Annu Rev Financ Econ 7:463–482

  • Murphy DL, Shrieves RE, Tibbs SL (2009) Understanding the penalties associated with corporate misconduct: an empirical examination of earnings and risk. J Financ Quant Anal 44(1):55–83

  • Nguyen DD, Hagendorff J, Eshraghi A (2015) Can bank boards prevent misconduct? Rev Finance 20(1):1–36

  • Nowak A, Ross A, Yencha C (2018) Small business borrowing and peer-to-peer lending: evidence from lending club. Contem Economic Policy 36(2):318–336

  • Oleksandr T, Xu H (2018) Role of verification in peer-to-peer lending. Working papers 2018-25, Swansea University, School of Management

  • Ospina R, Ferrari SL (2012) A general class of zero-or-one inflated beta regression models. Comput Stat Data Anal 56(6):1609–1623

  • Ozerturk S (2015) Moral hazard, skin in the game regulation and rating quality. Skin in the Game Regulation and Rating Quality (March 27, 2015)

  • Papoušková M, Hajek P (2020) Modelling loss given default in peer-to-peer lending using random forests. In Intelligent decision technologies 2019. Springer, Singapore, pp 133–141

  • Piskorski T, Seru A, Witkin J (2015) Asset quality misrepresentation by financial intermediaries: evidence from the RMBS market. J Financ 70(6):2635–2678

  • Polena M, Regner T (2018) Determinants of borrowers’ default in P2P lending under consideration of the loan risk class. Games 9(4):82

  • Pope DG, Sydnor JR (2011) What’s in a picture? Evidence of discrimination from Prosper.com. J Hum Resour 46(1):53–92

  • Pursiainen V (2020) Borrower misreporting in peer-to-peer loans (January 31, 2020)

  • Ravina E (2012) Love and loans: the effect of beauty and personal characteristics in credit markets, Columbia Univ. Working Paper

  • Serrano-Cinca C, Gutiérrez-Nieto B, López-Palacios L (2015) Determinants of default in P2P lending. PLoS ONE 10(10):e0139427

  • Smithson M, Verkuilen J (2006) A better lemon squeezer? Maximum-likelihood regression with beta-distributed dependent variables. Psychol Methods 11(1):54

  • Shen LH, Khan HU, Hammami H (2021) An empirical study of lenders’ perception of Chinese online peer-to-peer (P2P) lending platforms. J Altern Investments 23(4):152–175

  • Siao JS, Hwang RC, Chu CK (2016) Predicting recovery rates using logistic quantile regression with bounded outcomes. Quant Finance 16(5):777–792

  • Spence M (1973) Job market signaling. Q J Econ 87(3):355

  • Tanoue Y, Kawada A, Yamashita S (2017) Forecasting loss given default of bank loans with multi-stage model. Int J Forecast 33(2):513–522

  • Tao Q, Dong Y, Lin Z (2017) Who can get money? Evidence from the Chinese peer-to-peer lending platform. Inf Syst Front 19(3):425–441

  • Thakor RT, Merton RC (2018) Trust in lending (No. w24778). National Bureau of Economic Research

  • Vallee B, Zeng Y (2019) Marketplace lending: a new banking paradigm? Rev Financ Stud 32(5):1939–1982

  • Vismara S (2018) Signaling to overcome inefficiencies in crowdfunding markets. In The economics of crowdfunding. Palgrave Macmillan, Cham, pp 29–56

  • Wang H, Kou G, Peng Y (2021) Multi-class misclassification cost matrix for credit ratings in peer-to-peer lending. J Oper Res Soc 72(4):923–934

  • Williams R (2006) Generalized ordered logit/partial proportional odds models for ordinal dependent variables. Stata J 6(1):58–82

  • Yao J, Chen J, Wei J, Chen Y, Yang S (2019) The relationship between soft information in loan titles and online peer-to-peer lending: evidence from RenRenDai platform. Electron Commer Res 19(1):111–129

  • Ye H, Bellotti A (2019) Modelling recovery rates for non-performing loans. Risks 7(1):19

  • Zhou G, Zhang Y, Luo S (2018) P2P network lending, loss given default and credit risks. Sustainability 10(4):1010


Not applicable.


This research is not funded by a specific project grant.

Author information

Authors and Affiliations



The whole work, namely the analysis of the literature, the empirical analysis, and the discussion of the results, was conducted by the corresponding author.

Corresponding author

Correspondence to Serena Gallo.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix: Further analysis

See Tables 12 and 13.

Table 12 Endogeneity check
Table 13 Endogeneity check: regressions for the three risk classes

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit

About this article

Cite this article

Gallo, S. Fintech platforms: Lax or careful borrowers’ screening?. Financ Innov 7, 58 (2021).
