One Model to Rule Them All? Finding the Cracks in Your Global Credit Model

There's an ongoing debate about whether multiple specialized models or a single global model is better for credit risk modeling. I'm exploring this question at a company I'm working with, and I'm not sure how thoroughly it has been addressed elsewhere; similar discussions come up in other supervised learning applications, too. I don't have a definitive answer yet, but I recently found a systematic way to surface the weaknesses of a global model when it is applied to different segments of the population.

According to machine learning theory, we can achieve low test error given enough data drawn from the same distribution. Supervised learning degrades, however, when the sample distribution shifts. In the credit industry, this happens routinely because of differences in how users are acquired, in data availability (e.g., users with limited credit history), in product offerings (e.g., premium products offered only to certain users), and similar factors. In practice, we have consistently seen the global model perform worse in specific user groups.

Let's analyze the global model's weaknesses and strengths by pinpointing the segments that deviate the most from the average. A decision tree is well suited for this task, since its output is an explicit segmentation. The catch is that the quantity we want the tree to optimize, a performance metric such as KS, is not a per-row label available before segmenting. In essence, we're searching for segments that maximize the difference in KS (or whatever performance metric we care about) across segments. Unfortunately, no existing scikit-learn estimator handles this directly. Fortunately, I came across an "old school" method that does: model-based recursive partitioning, which has an R implementation in the partykit package.

Model-based Recursive Partitioning (MOB)

Model-based recursive partitioning combines recursive partitioning, the machinery behind decision trees, with parametric statistical models such as linear and logistic regression.

| Aspect | Standard Decision Tree | Model-Based Recursive Partitioning |
|---|---|---|
| Terminal node content | Constant value/class | Parametric model (e.g., linear regression) |
| Flexibility within nodes | None; same prediction for all rows | Model adapts to local patterns |
| Parameters estimated | One per node | Multiple per node (model coefficients) |
| Relationship modeling | Only through splits | Splits + local relationships |
| Interpretability | Simple rules | Rules + local model effects |

MOB begins with a global parametric model, for example a linear regression fit on all the data. It then tests for parameter instability: do the model's parameters vary significantly across subgroups defined by candidate partitioning variables? If significant instability is found, the data are split on the variable with the strongest instability and separate models are estimated for each subgroup. This process repeats recursively.
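As a rough illustration, here is a minimal sketch of how this could look with partykit's glmtree(): fit a logistic regression of default on the global model's score inside each node, and let MOB search for splits over candidate segmentation variables. The data frame and column names (loans, default, score, acquisition_channel, credit_history_length, product_tier) are hypothetical placeholders, not from my actual use case.

```r
## Sketch: MOB with a local logistic regression of default on the global score.
## All data/column names below are hypothetical placeholders.
library(partykit)

mob_fit <- glmtree(
  default ~ score | acquisition_channel + credit_history_length + product_tier,
  data    = loans,        # hypothetical data frame of scored loans with outcomes
  family  = binomial,
  alpha   = 0.05,         # significance level for the parameter instability tests
  minsize = 2000          # keep terminal nodes large enough for a stable local fit
)

plot(mob_fit)   # tree structure with a local logistic fit in each terminal node
coef(mob_fit)   # per-node intercept and slope on the global model score
```

A node whose slope on the score is much flatter than average points to a segment where the global model separates good and bad accounts poorly, which is roughly the KS-difference intuition described above.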

My use case fits this framework because each performance dimension can be represented by a parametric model. For instance, KS can be approximated through the slope of default rate versus model score; the missing data rate can be expressed as a weighted missing rate over the model's top features; and calibration accuracy can be represented by the regression logit(E[Y | p̂]) = α + β·logit(p̂), where perfect calibration corresponds to α = 0 and β = 1.
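For the calibration case specifically, the same glmtree() pattern applies: regress the observed outcome on logit(p̂) with a logistic link, so that α = 0 and β = 1 corresponds to perfect calibration, and let MOB find segments where (α, β) drift. Again, the variable names below are hypothetical.

```r
## Sketch: calibration drift as a MOB problem. Nodes whose intercept is far
## from 0 or whose slope is far from 1 are segments where the global model's
## probabilities are miscalibrated. Column names are hypothetical placeholders.
loans$logit_phat <- qlogis(loans$phat)   # logit of the global model's predicted PD

calib_fit <- glmtree(
  default ~ logit_phat | acquisition_channel + credit_history_length + product_tier,
  data   = loans,
  family = binomial
)

coef(calib_fit)   # per-node (alpha, beta) estimates to compare against (0, 1)
```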

Results

Although I'm not allowed to share the actual results, the method does a good job of identifying user segments with notable performance differences, and the resulting splits are intuitive and easy to interpret.

Reference