Worst Practices in Data Mining



I recently read the article Worst practices in business forecasting, written by Michael Gilliland and Udo Sglavo. It was published in the July/August issue of AnalyticsMagazine, which is, by the way, an excellent journal about analytics. In their article, the authors examine why forecasts are sometimes completely wrong. According to them, there are four main reasons:

  • Unsound software
  • Untrained, unskilled, inexperienced or unmotivated forecasters
  • Political contamination
  • Unforecastable behavior

I particularly like a few sentences from the article, which really point out important issues in data mining:

No software, no matter how powerful, and no analyst, no matter how talented, can guarantee perfect (or even highly accurate) forecasts.

Forecast accuracy is ultimately limited by the nature of the behavior being forecast.

Another interesting point is the authors' discussion of inappropriate performance objectives. It is inappropriate to set a single overall objective (a target classification accuracy, for example) that is supposed to fit any data mining problem. This is strongly related to the post What is a good classification accuracy in data mining?, published a few weeks ago on Data Mining Research.
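To make the point about a single accuracy objective concrete, here is a minimal sketch in Python. It is not taken from the article: the two label sets, the 90% target, and the helper name majority_baseline_accuracy are all made up for illustration. It compares the accuracy you get by always predicting the most frequent class on a balanced and on an imbalanced data set.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent class."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


# Two hypothetical label sets, invented for this illustration.
balanced = ["yes"] * 50 + ["no"] * 50
imbalanced = ["yes"] * 5 + ["no"] * 95

# A hypothetical one-size-fits-all accuracy objective.
target = 0.90

for name, labels in [("balanced", balanced), ("imbalanced", imbalanced)]:
    baseline = majority_baseline_accuracy(labels)
    if baseline >= target:
        verdict = "target already met by trivial guessing"
    else:
        verdict = "target requires a real model"
    print(f"{name}: baseline accuracy = {baseline:.2f} ({verdict})")
```

Under these assumptions, the same 90% objective is trivially met on the imbalanced data and demanding on the balanced data, which is why a fixed accuracy figure says little without the baseline of the problem at hand.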

To read the full article: Worst practices in business forecasting (the page may take some time to load, so be patient)