Financial inclusion remains a pressing global concern: a large share of the world’s population lacks access to basic financial services that can serve as stepping stones toward a better economic position. A missing or thin credit history is a major barrier, because traditional banks often rely on rigorous credit scoring models that make it hard for these ‘unbanked’ individuals to obtain loan approvals.
Home Credit Group, an international consumer finance provider, is working to address this issue. Its mission is to extend financial inclusion to this underserved population by offering a safe and positive borrowing experience, which also helps customers build their credit profiles over time. To make its credit assessment more comprehensive, Home Credit Group incorporates a variety of alternative data sources, notably telecommunications and transactional records.
The Home Credit Group data comprises 356,255 instances described by 60 continuous and 11 categorical attributes. The classes are heavily imbalanced: only 8.07% of instances are labeled as fraud, while the remaining 91.93% are non-fraudulent.
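To make the class split concrete, the sketch below works only from the percentages quoted above. It computes the accuracy that a trivial classifier labeling every instance as non-fraudulent would reach, together with inverse-frequency class weights, which are one common way to compensate for a skewed class distribution. The function names and the choice of weighting formula are illustrative assumptions, not methods reported for this dataset.

```python
# Share of instances labeled as fraud, as quoted in the text.
POSITIVE_RATE = 0.0807


def majority_baseline_accuracy(positive_rate: float) -> float:
    """Accuracy of a classifier that always predicts the majority class."""
    return max(positive_rate, 1.0 - positive_rate)


def inverse_frequency_weights(positive_rate: float) -> dict:
    """Class weights n / (k * n_c) for k = 2 classes, expressed via class shares.

    Assumption: this is the usual 'balanced' heuristic, chosen for illustration.
    """
    return {
        0: 1.0 / (2.0 * (1.0 - positive_rate)),  # majority (non-fraud) class
        1: 1.0 / (2.0 * positive_rate),          # minority (fraud) class
    }


baseline = majority_baseline_accuracy(POSITIVE_RATE)
weights = inverse_frequency_weights(POSITIVE_RATE)

print(f"majority-class baseline accuracy: {baseline:.4f}")
print(f"class weights: {weights}")
```

Because a model that never flags anything is already right on 91.93% of instances, plain accuracy says little here; metrics that account for the minority class are more informative.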
Despite the volume and diversity of this data, the central challenge is to use the alternative data to its full potential for accurate credit risk prediction. Home Credit Group already employs a range of statistical and machine learning techniques and continues to refine them. The goal is twofold: to avoid rejecting creditworthy clients, and to structure loan offers so that borrowers have the best chance of repaying them.
Through careful model selection, hyperparameter tuning, and training, we aimed to build robust machine learning models that accurately detect credit card fraud and thereby save substantial resources for financial institutions. The results are shown in the following graph:
While advanced machine learning models can achieve high accuracy, this often comes at the cost of complexity and poor interpretability. The ‘black box’ nature of these models is a drawback in contexts such as lending, where understanding the decision-making process is important.
Interpretability here refers to the extent to which we can understand the decisions made by our machine learning model. To ensure fairness and transparency in the lending process, we need to know which factors the model treats as significant when predicting a client’s repayment ability.
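One model-agnostic way to ask which factors a model relies on is permutation importance: shuffle one feature at a time and measure how much the model's accuracy drops. The sketch below is a minimal illustration on synthetic data, not the method used in this work. The feature names (`income_ratio`, `noise`), the fixed threshold rule standing in for a trained model, and the sample size are all hypothetical.

```python
import random

random.seed(0)

# Synthetic data (assumption): the label depends only on `income_ratio`;
# `noise` is irrelevant by construction.
FEATURES = ["income_ratio", "noise"]
X = [[random.random(), random.random()] for _ in range(2000)]
y = [1 if row[0] > 0.7 else 0 for row in X]


def model_predict(row):
    """Hypothetical stand-in for a trained model: thresholds the first feature."""
    return 1 if row[0] > 0.7 else 0


def accuracy(rows, labels):
    """Fraction of rows for which the model's prediction matches the label."""
    correct = sum(model_predict(r) == t for r, t in zip(rows, labels))
    return correct / len(labels)


def permutation_importance(rows, labels, col, n_repeats=5):
    """Mean drop in accuracy after shuffling column `col` across rows."""
    base = accuracy(rows, labels)
    drops = []
    for _ in range(n_repeats):
        shuffled = [r[col] for r in rows]
        random.shuffle(shuffled)
        # Rebuild each row with only column `col` replaced by a shuffled value.
        permuted = [r[:col] + [v] + r[col + 1:] for r, v in zip(rows, shuffled)]
        drops.append(base - accuracy(permuted, labels))
    return sum(drops) / n_repeats


importances = {
    name: permutation_importance(X, y, i) for i, name in enumerate(FEATURES)
}
for name, score in importances.items():
    print(f"{name}: {score:.3f}")
```

A feature the model ignores shows no drop in accuracy when shuffled, while a feature it depends on shows a clear drop. The same procedure can be applied to any fitted model by replacing `model_predict`.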