The Netflix Prize and Freeing Data Analytics

September 24, 2009

This week, Netflix presented the first Netflix Prize, awarding the BellKor's Pragmatic Chaos team the $1 million grand prize in a ceremony in New York. The most exciting news is that Netflix announced a second round of the Netflix Prize, using demographics and other data instead of movie ratings.

There are a lot of great articles about the contest, the winners and the impact on Netflix. This post is not about rehashing the contest. I think it was a masterstroke by Netflix to open up its data sets and harness the power of the net to drive innovation. 

There are more contests coming, which is another great thing. And I like the format of the new Netflix contest better – a 6-month interval, then an 18-month interval. This starts to approach reasonable payback time frames for companies looking to make an investment, compared to the 3-year process of the first contest.

I was testing the visualization program Tableau last year, and I used a data set from Sean Lahman’s Baseball Archive. What was cool was that, using salary data, I was able to show that the hated Yankees of New York have spent more money on salaries since their last World Series win (which was sometime last century – that’s right, the last century, when radio was popular and people watched silent movies) than my beloved Red Sox did in 78 years of frustration. That’s right, the Yankees have blown more than a billion dollars and have won zip. Nada. Nothing. Meanwhile, my children have lived the blessed life of experiencing only this rapturous time of Red Sox domination and Yankee futility.

But my point is that bringing in that data set allowed me to test the Tableau software and find interesting ways of visualizing the data in a matter of minutes. We are moving to a time when data analytics need to be freed from proprietary data.

Teradata has been winning deals through our benchmark center for years by proving our scalability and speed on the complex problems our customers face. Often, their queries will not even run on their existing infrastructure. For IT this is really important, but for the business user, speed and performance are not the only things that matter.

I think there may be a challenge here for our customers: provide some open, anonymous data sets so that we can help them not just with performance but also with improving the quality of the analytics they get. Obviously there are privacy and security constraints that need to be taken into account. But I would love to get input from readers about how we could facilitate benchmarking the quality of analytics – not just the performance aspect.
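To make "quality" a little more concrete: the Netflix Prize scored entries by root mean squared error (RMSE) on ratings that were held back from contestants. A quality benchmark on an open data set could be scored in a similar way. Here is a minimal sketch in Python, assuming a simple prediction task. The function names and all the numbers are made up for illustration and are not from any real benchmark.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def improvement(baseline_rmse, candidate_rmse):
    """Fractional reduction in error versus the baseline (0.10 = 10% lower)."""
    return (baseline_rmse - candidate_rmse) / baseline_rmse


# Hypothetical held-out values and two sets of predictions (illustrative only).
held_out = [4.0, 3.0, 5.0, 2.0]
baseline_predictions = [3.0, 3.0, 3.0, 3.0]   # e.g. always predict the average
candidate_predictions = [4.0, 3.0, 4.0, 2.0]  # e.g. a new model being evaluated

baseline_score = rmse(baseline_predictions, held_out)
candidate_score = rmse(candidate_predictions, held_out)
print(f"baseline RMSE:  {baseline_score:.3f}")
print(f"candidate RMSE: {candidate_score:.3f}")
print(f"improvement:    {improvement(baseline_score, candidate_score):.1%}")
```

The idea is that the held-out values stay hidden from whoever builds the model, so the score measures how good the analytics are rather than how fast they ran.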

Our team has been working specifically on the integration of online and offline data. If there are customers out there who would be willing to provide data sets for analytics, including web visitor data, online advertising, search marketing, social media or other areas, we would be excited to hear from you and work with you on developing new insights from this data. With our Integrated Web Intelligence analytical assets and partners such as KXEN, Optimine, Webtrends and MicroStrategy, we may be able to provide new analytics that transform your multi-channel marketing.

 

You can reach me through this blog, or at paul.barrett@teradata.com 

Shameless self-promotion: Check out the Webtrends blog for a guest blog post.

 

 

Paul Barrett
