Data Mining to Raise Questions

October 7, 2011

At first, one can think of data mining as a way to answer questions, and that is indeed one way of using it. Here are some examples of such questions:

  • Which of my customers are most likely to churn next month?
  • Which parameters matter most for predicting tomorrow's weather?
  • Are there groups (clusters) among my clients?
  • What is the probability that this image contains a human face?

Another way of leveraging your databases is to use data mining to raise questions or highlight strange facts. Often the new question is the natural follow-up to one that data mining has just answered.

For example, a correlation analysis tells you that p1 and p2 have a correlation of 0.8. This is an answer. The question that follows, addressed to the expert in the field, is: why are p1 and p2 highly correlated? The same situation arises with outlier detection. It tells you that account X is an outlier in the dataset, which answers the question: which accounts are outliers? The question it raises is: why is this account an outlier? This highlights a strange case that should be studied further.

In conclusion, don’t forget to use data mining to raise questions. Doing so may have as much value as answering them.
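The correlation and outlier examples above can be turned into a small sketch. It is only an illustration: the data is made up, the function names (`pearson`, `zscore_outliers`, `raise_questions`) are hypothetical, and the simple z-score rule with a threshold of 2 is an assumed stand-in for whatever outlier detection method you actually use. The point is that each result is converted into a question for the domain expert rather than treated as a final answer.

```python
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    return cov / (pstdev(xs) * pstdev(ys))


def zscore_outliers(values, threshold=2.0):
    """Indices of values lying more than `threshold` standard deviations from the mean."""
    m, s = mean(values), pstdev(values)
    return [i for i, v in enumerate(values) if abs(v - m) > threshold * s]


def raise_questions(p1, p2, balances, corr_threshold=0.8):
    """Turn data mining answers into follow-up questions for an expert."""
    questions = []
    r = pearson(p1, p2)
    if abs(r) >= corr_threshold:
        questions.append(f"Why are p1 and p2 highly correlated (r = {r:.2f})?")
    for i in zscore_outliers(balances):
        questions.append(f"Why is account {i} an outlier (value = {balances[i]})?")
    return questions


# Made-up data, for illustration only.
p1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
p2 = [1.2, 1.9, 3.2, 3.8, 5.1, 6.3]
balances = [100, 105, 98, 102, 97, 101, 99, 500]

for question in raise_questions(p1, p2, balances):
    print(question)
```

The answers (a correlation coefficient, a list of outlier indices) are computed first, and only then rephrased as "why" questions, which mirrors the order described in the text.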