An Indian bank wanted to switch from rule-based loan default prediction to a predictive solution.
The banking client was using a rule-based solution to estimate the probability of default (PD) of its vehicle loan customers, scoring customers on their loan repayment behaviour. The client wanted to use predictive scorecards to understand the factors driving default, draw transparent interpretations from the model, and find out whether PD could be predicted more accurately than with the rule-based practice.
Data & advanced analytics for improved accuracy
We used a predictive binary classification model to estimate the probability that a customer would fail to repay the loan:
- Established the definition of default to align with customer behaviour
- 78% lift in top 3 deciles
- AUC of 0.81 and KS statistic of 0.41 on the validation data
The client integrated the predictive model's output with its business rules and was able to reduce default rates.
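The validation metrics listed above (AUC, KS and performance in the top three deciles) can be computed from a list of observed outcomes and model scores. The following is a minimal sketch in plain Python, not the project's actual evaluation code. The function names and the toy data are hypothetical, and because the source does not say how its lift figure was calculated, the capture-rate definition used here is an assumption.

```python
from typing import Sequence


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based AUC: chance that a random defaulter outscores a random non-defaulter."""
    pairs = sorted(zip(scores, y_true))
    n = len(pairs)
    rank_sum_pos = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # average of the 1-based ranks i+1 .. j (handles ties)
        rank_sum_pos += avg_rank * sum(label for _, label in pairs[i:j])
        i = j
    n_pos = sum(y_true)
    n_neg = n - n_pos
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def ks_statistic(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Largest gap between the cumulative shares of defaulters and non-defaulters."""
    pairs = sorted(zip(scores, y_true), reverse=True)
    n = len(pairs)
    n_pos = sum(y_true)
    n_neg = n - n_pos
    cum_pos = cum_neg = 0
    best = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            cum_pos += pairs[j][1]
            cum_neg += 1 - pairs[j][1]
            j += 1
        best = max(best, abs(cum_pos / n_pos - cum_neg / n_neg))
        i = j
    return best


def top_decile_capture(y_true: Sequence[int], scores: Sequence[float],
                       top_deciles: int = 3) -> float:
    """Share of all defaulters found among the highest-scoring deciles (assumed definition)."""
    order = sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)
    cutoff = len(order) * top_deciles // 10
    captured = sum(y_true[k] for k in order[:cutoff])
    return captured / sum(y_true)


# Toy data for illustration only: 1 = defaulted, 0 = repaid.
toy_outcomes = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]

print("AUC:", auc(toy_outcomes, toy_scores))
print("KS:", ks_statistic(toy_outcomes, toy_scores))
print("Top-3-decile capture:", top_decile_capture(toy_outcomes, toy_scores))
```

Dividing the capture rate by the share of customers selected (0.3 for three deciles) gives one common definition of lift; other definitions exist, so any comparison with the figure reported above should first confirm which one was used.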
Manual & opaque process
The client relied on rule-based and credit-expert-based assessment of default risk. It had invested in data infrastructure and database management systems and was ready to apply advanced analytics to its operational processes. Vehicle loan default was high on the agenda owing to higher-than-acceptable default rates and no clear understanding of how to address them.
The primary requirement for a model-based approach to probability of default (PD) prediction was a transparent, automated solution that would integrate with the existing IT infrastructure. The client also wanted a better understanding of the attributes leading to higher default propensity. We proposed a logistic regression-based PD scorecard, as it aligns well with the requirement for transparency.
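The transparency argument follows from the form of the model. As a general statement of logistic regression (the symbols below are illustrative and do not come from the client's model), the log-odds of default are a weighted sum of the customer attributes:

```latex
\[
\log\frac{\mathrm{PD}}{1-\mathrm{PD}} = \beta_0 + \sum_{j=1}^{k}\beta_j x_j
\quad\Longleftrightarrow\quad
\mathrm{PD} = \frac{1}{1+\exp\!\left(-\left(\beta_0+\sum_{j=1}^{k}\beta_j x_j\right)\right)}
\]
```

Here $x_j$ is the value of attribute $j$, $\beta_j$ is its estimated weight and $\beta_0$ is the intercept. Each term $\beta_j x_j$ can be read as that attribute's additive contribution to the log-odds of default, which is what makes the score explainable attribute by attribute.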
Tailored, flexible and transparent solution
A logistic regression-based probability of default (PD) scorecard provides a standalone solution for default prediction.
With API-based integration to the databases, an internal tool was developed for the bank to generate default propensity scores. It offers:
a. A balance of speed and accuracy
b. Full transparency, with attribute-level weight estimates showing each attribute's contribution to the default probability
c. A customized, flexible solution delivered with source code so that the client can make future improvements
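To illustrate what attribute-level transparency can look like in practice, the sketch below scores one customer with a logistic regression scorecard and breaks the score down by attribute. It is only an illustration: the attribute names, the intercept and the weights are hypothetical and are not taken from the bank's model or tool.

```python
import math
from typing import Dict

# Hypothetical intercept and attribute weights, for illustration only.
INTERCEPT = -2.0
WEIGHTS: Dict[str, float] = {
    "missed_payments_last_12m": 0.8,
    "loan_to_value_ratio": 1.5,
    "months_on_book": -0.03,
}


def log_odds_contributions(customer: Dict[str, float]) -> Dict[str, float]:
    """Each attribute's additive contribution (weight * value) to the log-odds of default."""
    return {name: weight * customer[name] for name, weight in WEIGHTS.items()}


def probability_of_default(customer: Dict[str, float]) -> float:
    """Logistic transform of the intercept plus the summed attribute contributions."""
    z = INTERCEPT + sum(log_odds_contributions(customer).values())
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical customer record.
customer = {
    "missed_payments_last_12m": 2,
    "loan_to_value_ratio": 0.9,
    "months_on_book": 24,
}

for name, contribution in log_odds_contributions(customer).items():
    print(f"{name}: {contribution:+.2f} log-odds")
print(f"PD: {probability_of_default(customer):.3f}")
```

Because the contributions add up on the log-odds scale, a credit analyst can see directly which attributes push an individual score up or down, which is harder to do with less transparent model families.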
Although logistic regression-based PD scorecards are among the earliest methods, they remain among the most successful and widely used, owing to their transparency.