Evaluating Successful Predictive Analytics Solutions

April 19, 2011

Numerous studies compare the many different kinds of mathematical techniques used in predictive modeling. From an academic standpoint, the arguments for the merits of one technique over another add to the knowledge base of practitioners. Practitioners, however, apply these techniques to practical problems with a view to how they actually affect the business. This means that model evaluation does not rest solely on pure statistical measures. Instead, the practitioner’s key report for assessing model performance is the gains table or decile chart.

The key benchmark in this report is how well the model rank-orders the desired behaviour. There are two approaches to conducting this evaluation. The first is to create a Lorenz curve, which plots the actual or observed behaviour against the deciles. These deciles, or groups, are determined by the model's scores, with decile 1 containing the highest-scored names and decile 10 the lowest-scored names. A model that rank-orders well shows the observed rate falling steadily from decile 1 to decile 10.
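To make the decile report concrete, here is a minimal sketch in Python of how such a table might be built. The function name `decile_table`, the sample scores and the 0/1 outcomes are all hypothetical and chosen only for illustration. The sketch assumes each name has a model score and an observed outcome of 1 (desired behaviour) or 0.

```python
from typing import Dict, List


def decile_table(scores: List[float], outcomes: List[int],
                 n_groups: int = 10) -> List[Dict[str, float]]:
    """Rank names by score (highest first), cut them into equal-sized
    groups, and report the observed response rate in each group."""
    if len(scores) != len(outcomes):
        raise ValueError("scores and outcomes must have the same length")
    if len(scores) < n_groups:
        raise ValueError("need at least one name per group")

    # Highest scores first, so that group 1 is the top decile.
    ranked = sorted(zip(scores, outcomes), key=lambda p: p[0], reverse=True)
    n = len(ranked)

    rows = []
    for g in range(n_groups):
        group = ranked[g * n // n_groups:(g + 1) * n // n_groups]
        responders = sum(outcome for _, outcome in group)
        rows.append({
            "decile": g + 1,
            "count": len(group),
            "responders": responders,
            "rate": responders / len(group),
        })
    return rows


# Hypothetical data: 20 scored names, already listed from highest score down.
sample_scores = [i / 20 for i in range(20, 0, -1)]
sample_outcomes = [1, 1, 1, 0, 1, 0, 1, 0, 0, 0,
                   0, 1, 0, 0, 0, 0, 0, 0, 0, 0]

table = decile_table(sample_scores, sample_outcomes)
for row in table:
    print(row["decile"], row["count"], row["responders"], round(row["rate"], 2))
```

Reading down the printed rates is the rank-ordering check: the rate should be highest in decile 1 and trend downward. In this made-up sample there is a small reversal in decile 6, which is the kind of pattern a practitioner would look at more closely.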

The second approach is to plot the cumulative % of the desired behaviour against each decile, with deciles created in the same manner as explained above. If the model is completely ineffective, the result is a straight diagonal line, while if the model is performing well, the line bows above the diagonal in a shape resembling a parabola. The model’s effectiveness is reflected in the gap between the curve and the straight line, which is commonly summarized by the KS statistic: the maximum separation between the cumulative distribution of names showing the desired behaviour and that of names not showing it.
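The cumulative view can be sketched the same way. The Python below is an illustration under stated assumptions, not a definitive implementation: the function name `cumulative_gains_and_ks` and the sample data are hypothetical, and the KS value is computed at decile boundaries only, so it approximates the statistic that would be obtained from individual records.

```python
from typing import List, Tuple


def cumulative_gains_and_ks(scores: List[float], outcomes: List[int],
                            n_groups: int = 10) -> Tuple[List[float], float]:
    """Return the cumulative share of responders captured through each
    decile, plus a decile-level KS statistic."""
    ranked = sorted(zip(scores, outcomes), key=lambda p: p[0], reverse=True)
    n = len(ranked)
    total_pos = sum(outcome for _, outcome in ranked)
    total_neg = n - total_pos
    if total_pos == 0 or total_neg == 0:
        raise ValueError("need both responders and non-responders")

    gains = []
    ks = 0.0
    cum_pos = 0
    cum_neg = 0
    for g in range(n_groups):
        group = ranked[g * n // n_groups:(g + 1) * n // n_groups]
        pos = sum(outcome for _, outcome in group)
        cum_pos += pos
        cum_neg += len(group) - pos
        pct_pos = cum_pos / total_pos  # cumulative % of responders
        pct_neg = cum_neg / total_neg  # cumulative % of non-responders
        gains.append(pct_pos)
        # KS is the largest gap between the two cumulative distributions.
        ks = max(ks, pct_pos - pct_neg)
    return gains, ks


# Hypothetical data: 20 scored names, already listed from highest score down.
sample_scores = [i / 20 for i in range(20, 0, -1)]
sample_outcomes = [1, 1, 1, 0, 1, 0, 1, 0, 0, 0,
                   0, 1, 0, 0, 0, 0, 0, 0, 0, 0]

gains, ks = cumulative_gains_and_ks(sample_scores, sample_outcomes)
print([round(g, 2) for g in gains])
print(round(ks, 3))
```

An ineffective model would capture roughly 10% of responders per decile, so its gains would climb in a straight line. The further the printed gains sit above that line in the early deciles, the better the model separates the two groups.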

As practitioners, we can use either of these tools to evaluate models and to decide on appropriate courses of action, such as model reuse or model rebuild. Evaluating predictive analytics solutions in this manner also allows us to derive further business metrics such as ROI, which all businesses can easily understand.
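As a rough illustration of how a decile table can feed an ROI figure, the Python sketch below computes cumulative ROI as a campaign is extended deeper into the ranked list. Everything here is assumed for the example: the function name `cumulative_roi`, the decile counts, the cost of 1.00 per contact and the revenue of 5.00 per response are hypothetical placeholders, and ROI is taken to mean (revenue - cost) / cost.

```python
from typing import List


def cumulative_roi(counts: List[int], responders: List[int],
                   cost_per_contact: float,
                   revenue_per_response: float) -> List[float]:
    """ROI from contacting decile 1 only, then deciles 1-2, and so on.
    ROI is defined here as (revenue - cost) / cost."""
    if cost_per_contact <= 0:
        raise ValueError("cost_per_contact must be positive")

    roi = []
    cum_contacts = 0
    cum_responders = 0
    for n, r in zip(counts, responders):
        cum_contacts += n
        cum_responders += r
        if cum_contacts == 0:
            raise ValueError("first decile must contain at least one name")
        cost = cum_contacts * cost_per_contact
        revenue = cum_responders * revenue_per_response
        roi.append((revenue - cost) / cost)
    return roi


# Hypothetical decile table: 2 names per decile, responders per decile.
decile_counts = [2] * 10
decile_responders = [2, 1, 1, 1, 0, 1, 0, 0, 0, 0]

roi_by_depth = cumulative_roi(decile_counts, decile_responders,
                              cost_per_contact=1.0,
                              revenue_per_response=5.0)
for depth, value in enumerate(roi_by_depth, start=1):
    print(f"deciles 1-{depth}: ROI {value:.2f}")
```

With a model that rank-orders well, ROI is highest when only the top deciles are contacted and declines as lower-scored names are added, which gives the business a simple way to choose a cut-off.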