Statistical Analysis and Data Mining

November 21, 2009

I don’t keep many actual books next to my desk these days. I have found that my hard drive has become my main knowledge repository. For those interested, everything I receive online (email, documents, spreadsheets, video, research papers, etc.) is fed into my knowledge base using DEVONthink.

A rare exception to this is a new book that has really impressed me: Handbook of Statistical Analysis and Data Mining Applications by Robert Nisbet, John Elder IV, and Gary Miner. It is available on Amazon for about AUD80.

Why has this 800+ page book squeezed its way onto my crowded desk? It’s useful to a part-time data miner whose post-graduate maths and stats courses are in the dim and distant 1990s. I have found it useful in a number of ways:

  • Reference Guide. Section II is a lexicon of the algorithms used in structured and unstructured (i.e. text) data mining.
  • Problem Solving. Section III is a substantial how-to guide to data mining in practice. The 13 tutorials cover a wide range of problems and industries/fields.
  • Mentoring. Section I is a great primer for people new to the field. I would use it to help any analyst who joins one of my teams.

I haven’t yet made use of Section IV of the book (Measuring True Complexity, the “right model for the right use”, Top Mistakes, and the Future of Analytics), but I know it’s something I should get to.

The book is a practical guide to using SAS Enterprise Miner and STATISTICA Data Miner. There is also a section on SPSS Clementine, and sprinkled throughout the book are STATISTICA’s C&RT, CHAID, MARSpline, and other data mining and graphical analytic tools.

Here’s a link to the table of contents.

I don’t need it every week, but when I do I’m really glad I have it to hand.

