How many models is enough?

March 3, 2009

I recently missed a presentation by a data mining software vendor (due to my recent paternity break), but I’ve been reviewing my colleagues’ notes and the vendor’s presentation slides. I won’t name the vendor; you can probably work it out.

A significant part of the vendor solution is the ability to manage a large number of data mining models (predictive, clustering, etc.); we’re talking hundreds.

In my group we do not have many data mining models, maybe a dozen, which we run on a weekly or monthly basis. Each model is quite comprehensive and scores the entire customer base (or near to it) for a specific outcome (churn, up-sell, cross-sell, acquisition, inactivity, credit risk, etc.). We can subsequently select sub-populations from the customer base for targeted communications based upon the score or outcome of any single model or combination of models, or on any criteria taken from customer information.
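To make the "score everyone first, select afterwards" approach concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the `Customer` fields, the state codes, the scores, and the thresholds are assumptions, not our actual models or data.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    customer_id: int
    state: str            # hypothetical attribute from customer information
    churn_score: float    # assumed output of a base-wide churn model
    upsell_score: float   # assumed output of a base-wide up-sell model


def select_targets(customers, min_churn=0.7, min_upsell=0.5, state=None):
    """Pick campaign targets AFTER scoring the whole base.

    The models themselves see every customer; the sub-population is
    chosen only at selection time, from scores plus optional criteria.
    """
    selected = []
    for c in customers:
        if c.churn_score < min_churn or c.upsell_score < min_upsell:
            continue
        if state is not None and c.state != state:
            continue
        selected.append(c.customer_id)
    return selected


# Illustrative data only.
customers = [
    Customer(1, "A", 0.91, 0.62),
    Customer(2, "B", 0.75, 0.20),
    Customer(3, "A", 0.40, 0.88),
    Customer(4, "B", 0.82, 0.71),
]

targets_all = select_targets(customers)             # combine two model scores
targets_a = select_targets(customers, state="A")    # add a customer criterion
```

Changing the campaign criteria here means changing the filter, not rebuilding or re-validating a model.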

I’m not entirely sure why you would want hundreds of models in a Telco (or similar) space. Any selection criteria applied to specific customers (say, by age, gender, state, or spend) before modeling will of course force a biased sample that feeds into the model and shapes what it learns. Once this type of selective sampling is performed, you can’t easily track the corresponding model over time *if* the sampled sub-population ever changes (which is likely, because people do get older, move house, change their spend, etc.). For this reason I can’t understand why someone would want or have many models. It makes perfect sense in Retail (for example, a model for each product, or association rules for product recommendations), but not many models that each apply to a sub-population of your customer base.
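The drift problem is easy to see with a toy example. The sketch below assumes a hypothetical segment of customers aged 18 to 25 and made-up ages; it simply compares who is in the segment now with who is in it a year later.

```python
def segment(ages, min_age, max_age):
    """Return the ids of customers whose age falls inside the segment."""
    return {cid for cid, age in ages.items() if min_age <= age <= max_age}


# Illustrative data only: customer id -> age.
ages_now = {1: 24, 2: 25, 3: 19, 4: 31, 5: 17}
ages_next_year = {cid: age + 1 for cid, age in ages_now.items()}

before = segment(ages_now, 18, 25)
after = segment(ages_next_year, 18, 25)

left = before - after        # customers who aged out of the segment
joined = after - before      # customers who aged into it
retained = len(before & after) / len(before)

print(f"left={sorted(left)} joined={sorted(joined)} retained={retained:.0%}")
```

A model built only on the first segment would be scored a year later against a population that has partly turned over, so any change in its performance mixes real model decay with a change in who is being scored.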

Am I missing something here? If you are working with a few products or services and a large customer base, why would you prefer many models over a few?

Comments please 🙂