Micro vs. Macro Information Retrieval
The Probably Irrelevant blog has been quiet for a while, but I was happy to see a new post there…
The Map is not the Territory
“The word is not the thing, the map is not the territory” is a key principle of General Semantics and…
Beyond the Buzz: The Quiet Thunder of Active Data Warehousing
That cicada buzzing sound you hear in the high-tech media around data appliances may lead some to believe that this…
Indicators & KPIs
In a recent Wired magazine article “American Vice: Mapping the 7 Deadly Sins” (the original was in the Las Vegas…
Fantasy League Data Quality
For over 25 years, I have been playing fantasy league baseball and football. For those readers who are not familiar…
Machine Learning in R, in a nutshell
Josh Reich has created a concise R script demonstrating various machine-learning techniques in R with simple, self-contained examples. For example,…
Interactive stock visualizations with R
Jeroen Ooms, who recently completed his Masters in Statistics at Utrecht University, has created an outstanding web-based drag-and-drop application for…
#21: Here’s a thought…
An occasional series in which a review of recent posts on SmartData Collective reveals the following nuggets: They just don’t get…
HCIR: Better Than Magic!
I’m a big fan of using machine learning and automated information extraction to improve search performance and generally support information…
To Parse or Not To Parse
“To Parse, or Not To Parse,—that is the question: Whether 'tis nobler in the data to suffer The slings and…