While Microsoft's road to embracing R may be bumpy, it might still prove to be the best way, if not the only way, to secure a bright future in the predictive analytics market. Much work will likely be needed, including rewriting and optimizing, but in the end it could be the move that lets Microsoft compete in better shape in that market before it is too late.
Lots of data does not necessarily equate to “Big Data.” To my way of thinking, the single most important capability to implement in any large-scale data platform that is going to support sophisticated analytics is the ability to quickly construct high-quality random samples.
If you're laying down a friendly bet on the March Madness games or just tweaking your fantasy roster, this NCAA Data Visualizer by Rodrigo Zamith will be a boon. Just choose two teams to compare head-to-head, and choose an attribute to compare them on.
Back when my friends and I lived in different parts of Paris, it was tricky to find a mutually agreeable place to meet, so that we'd all be taking an approximately equally long Métro ride. If only we'd had Jean-Robert's Metro Meeting Point app, the decision would have been an easy one.
This year, with both Udacity and the Harvard- and MIT-backed edX offering interesting and challenging courses, the growth of MOOC enrollment must be astounding indeed. Then again, while MOOCs are “free,” for a working professional they are not without opportunity costs.
The Washington Post reports that by analyzing more than 10 million emails sent through the Yahoo! Mail service in 2012, a team of researchers used the R language to create a map of countries whose citizens email each other most frequently.
SnapLogic, a provider of data integration in the cloud, last week announced Big Data-as-a-Service to address businesses’ needs to integrate and process data across Hadoop big data environments. I look forward to seeing SnapLogic’s 2013 technology advancements.
Tired of manually running a Python script to scrape the latest bookmaker odds on the next Pope, R user AJ (an analytical research manager at a large healthcare company) instead created an R script to track the odds on the Papal successor.
Uri Laserson has created an excellent guide to resampling from a large data set in Hadoop. Resampling is an important step in fitting ensemble models (including random forests and other bagging techniques), and Uri provides a step-by-step guide to resampling with RHadoop.
After the elections of the 14th of October in Belgium, media reported several cases of candidates who had received more preference votes than normal. Using my data science skills and tools, I tested whether there was a faulty "Touch Screen Effect."
If people had the ability to collect data on a daily basis (see Quantified Self) and then analyze them on a massive scale, several unknown patterns that call for closer investigation could emerge. I used a smartphone to capture data about my health, and then used analytics to process that data.
The moderated business community for business intelligence, predictive analytics, and data professionals.