One Model to Rule Them All? Finding the Cracks in Your Global Credit Model

There's an ongoing debate about whether multiple specialized models or a single global model is better for credit risk modeling. I'm exploring this question at a company I'm working with, and I'm not sure how thoroughly it has been addressed elsewhere; similar discussions come up in other supervised learning applications, too. I don't have a definitive answer yet, but I recently found a systematic way to surface the weaknesses of a global model when it is applied to different segments of the population.

According to machine learning theory, we can achieve low test error given enough data drawn from the same distribution. Supervised learning degrades, however, when the sample distribution shifts. In the credit industry, this happens routinely because of differences in how users are acquired, in data availability (e.g., users with limited credit history), in product offerings (e.g., premium products offered only to certain users), and similar factors. In practice, we have consistently seen the global model perform worse in specific user groups.

Let's analyze the global model's weaknesses and strengths by pinpointing the segments that deviate the most from the average. A decision tree is well suited for this task, since its output is an explicit segmentation. The catch is that the quantity we want the tree to optimize, a performance metric such as KS, is not a per-row label available before segmenting. In essence, we're searching for segments that maximize the difference in KS (or whatever performance metric we care about) across segments. Unfortunately, no existing scikit-learn estimator handles this directly. Fortunately, I came across an "old school" method that does: model-based recursive partitioning, which has an R implementation in the partykit package.

Model-based Recursive Partitioning (MOB)

Model-based recursive partitioning combines recursive partitioning, the machinery behind decision trees, with parametric statistical models such as linear and logistic regression.

| Aspect | Standard Decision Tree | Model-Based Recursive Partitioning |
|---|---|---|
| Terminal node content | Constant value/class | Parametric model (e.g., linear regression) |
| Flexibility within nodes | None; same prediction for all rows | Model adapts to local patterns |
| Parameters estimated | One per node | Multiple per node (model coefficients) |
| Relationship modeling | Only through splits | Splits + local relationships |
| Interpretability | Simple rules | Rules + local model effects |

MOB begins with a global parametric model, for example a linear regression fit on all the data. It then tests for parameter instability: do the model's parameters vary significantly across subgroups defined by candidate partitioning variables? If significant instability is found, the data are split on the variable with the strongest instability and separate models are estimated for each subgroup. This process repeats recursively.
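As a rough illustration, here is a minimal sketch of how this could look with partykit's glmtree(): fit a logistic regression of default on the global model's score inside each node, and let MOB search for splits over candidate segmentation variables. The data frame and column names (loans, default, score, acquisition_channel, credit_history_length, product_tier) are hypothetical placeholders, not from my actual use case.

```r
## Sketch: MOB with a local logistic regression of default on the global score.
## All data/column names below are hypothetical placeholders.
library(partykit)

mob_fit <- glmtree(
  default ~ score | acquisition_channel + credit_history_length + product_tier,
  data    = loans,        # hypothetical data frame of scored loans with outcomes
  family  = binomial,
  alpha   = 0.05,         # significance level for the parameter instability tests
  minsize = 2000          # keep terminal nodes large enough for a stable local fit
)

plot(mob_fit)   # tree structure with a local logistic fit in each terminal node
coef(mob_fit)   # per-node intercept and slope on the global model score
```

A node whose slope on the score is much flatter than average points to a segment where the global model separates good and bad accounts poorly, which is roughly the KS-difference intuition described above.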

My use case fits this framework because each performance dimension can be represented by a parametric model. For instance, KS can be approximated through the slope of default rate versus model score; the missing data rate can be expressed as a weighted missing rate over the model's top features; and calibration accuracy can be represented by the regression logit(E[Y | p̂]) = α + β·logit(p̂), where perfect calibration corresponds to α = 0 and β = 1.
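For the calibration case specifically, the same glmtree() pattern applies: regress the observed outcome on logit(p̂) with a logistic link, so that α = 0 and β = 1 corresponds to perfect calibration, and let MOB find segments where (α, β) drift. Again, the variable names below are hypothetical.

```r
## Sketch: calibration drift as a MOB problem. Nodes whose intercept is far
## from 0 or whose slope is far from 1 are segments where the global model's
## probabilities are miscalibrated. Column names are hypothetical placeholders.
loans$logit_phat <- qlogis(loans$phat)   # logit of the global model's predicted PD

calib_fit <- glmtree(
  default ~ logit_phat | acquisition_channel + credit_history_length + product_tier,
  data   = loans,
  family = binomial
)

coef(calib_fit)   # per-node (alpha, beta) estimates to compare against (0, 1)
```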

Results

Although I'm not allowed to share the actual results, the method does a good job of identifying user segments with notable performance differences, and the resulting splits are intuitive and easy to interpret.

Reference