Cookies help us display personalized product recommendations and ensure you have great shopping experience.

By using this site, you agree to the Privacy Policy and Terms of Use.
Accept
SmartData CollectiveSmartData Collective
  • Analytics
    AnalyticsShow More
    data analytics
    How Data Analytics Can Help You Construct A Financial Weather Map
    4 Min Read
    financial analytics
    Financial Analytics Shows The Hidden Cost Of Not Switching Systems
    4 Min Read
    warehouse accidents
    Data Analytics and the Future of Warehouse Safety
    10 Min Read
    stock investing and data analytics
    How Data Analytics Supports Smarter Stock Trading Strategies
    4 Min Read
    predictive analytics risk management
    How Predictive Analytics Is Redefining Risk Management Across Industries
    7 Min Read
  • Big Data
  • BI
  • Exclusive
  • IT
  • Marketing
  • Software
Search
© 2008-25 SmartData Collective. All Rights Reserved.
Reading: Strata 2012: Big Data is Bigger than Ever!
Share
Notification
Font ResizerAa
SmartData CollectiveSmartData Collective
Font ResizerAa
Search
  • About
  • Help
  • Privacy
Follow US
© 2008-23 SmartData Collective. All Rights Reserved.
SmartData Collective > Uncategorized > Strata 2012: Big Data is Bigger than Ever!
Uncategorized

Strata 2012: Big Data is Bigger than Ever!

Daniel Tunkelang
Daniel Tunkelang
7 Min Read
SHARE

I spent the last three days at the O’Reilly Strata Conference, an assembly of two thousand people focused on data science and its applications. While I’m wary of industry conferences from attending vendor-fests in my past life in the enterprise software world, Strata is an exceptionally good conference. The speakers were a who’s who of data science, including Lucene and Hadoop creator Doug Cutting, search user interface pioneer Marti Hearst, and Google chief economist Hal Varian. You can find the tweet stream for the conference at hash tag #strataconf.

More Read

Another view: SOA won’t trump ‘toxic human behavior’
What Does the Regression Model Indicate?
Social Media: The Great Equalizer
The Demise of the Data Scientist: Heresy or Fact?
CTO Perspectives on Cyber Security Bill

Tuesday

I spent Tuesday in the Deep Data session, billed as a no-holds-barred program for data scientists. My two favorite talks:

  • Claudia Perlich, winner of three KDD cups, talked about using information to pick the right action and to influence people such that they behave in a way that is better for them, better for us, and possibly better for society in general.
  • Monica Rogati, my colleague at LinkedIn and the epitome of a data scientist, delivered a fantastic talk about machine learning models and training data in the real world, extending Peter Norvig‘s point about the “unreasonable effectiveness of data” to observe that more data beats clever algorithms but better data beats more data.

But the most fun that day was the Oxford-style debate featuring Pete Skomoroch, Mike Driscoll, DJ Patil, Amy Heineike, Pete Warden, and Toby Segaran. The question proposed was absurdly Manichean: if you had to hire your first data scientist and could only hire one, would you pick a domain expert or a machine learning expert? After the moderator suppressed some initial attempts to hedge (“both”, “it depends”, etc.), the debaters ripped into the question by taking extreme positions and defending them with gusto. It was a lot of fun, with enthusiastic audience participation and the debaters exploiting their inside knowledge of their opponents’ work histories. In the end, the machine learning side won by a small margin.

I then had the good fortune to grab dinner with Marti Hearst and Hal Varian at Xanh – a wonderful mix of great food and conversation.

Wednesday

The Wednesday morning keynote session offered some gems:

  • Cloudera CEO Mike Olson urged big data practitioners to focus on guns, drugs, and oil.
  • Doctor and data geek Ben Goldacre delivered a mesmerizing and disturbing talk about the suppression of inconvenient medical trial results and analytical tools to discover it.

But the person who stole the show was Google’s Avinash Kaushik, who talked about making love with data to find orgasm-inducing actions to change the world and make more money. Unfortunately this was the one talk that was not recorded, but you can read the summary on Avinash’s Google+ page.

As a speaker, I held “office hours” on Wednesday. It was supposed to be a 40-minute slot for conference attendees to come and ask me question. But somehow those 40 minutes extended into three hours of conversation about everything from normalized KL divergence to interview problems — and segued into a reception with specialty big-data cocktails. By the time I got back to my apartment, my voice, brain, and liver were spent.

Thursday

I spent most of Thursday morning in the speaker lounge, recovering from the previous evening and making last touches on my presentation. But I couldn’t resist attending a two-part session on privacy. Indeed, this session was distinctive enough to merits it’s own hash tag: #strataprivacy.

The first part featured O’Reilly’s Alex Howard moderating Intelius Chief Privacy Officer Jim Adler and NYU PhD student Solon Barocas on a panel provocatively titled  ”If Data Wants to Be Free, is Privacy a Prison?” It was a great discussion, and I enjoyed the opportunity to offer my own provocative question through Twitter. Since the panelists were arguing that it was unethical to infer private facts from public data, I asked if they were trying to establish a new form of thoughtcrime.

The second panel, entitled “Pretty Simple Data Privacy“, featured Kaitlin Thaney from Digital Science, Betsy Masiello from Google, and John Wilbanks from the Kauffman Foundation for Entrepreneurship. Given that today was the first day of Google’s new privacy policy, there was no avoiding focus on the associated controversy. I did try to get Betsy to address my charge that Google doesn’t think users own their search history (cf. “Google vs. Bing: A Tweetle Beetle Battle Muddle“), but she said she was unfamiliar with the details of that event. I do wish that someone at Google with more familiarity would respond publicly.

Back to the speaker room after lunch, until my own talk with Samasource’s Claire Hunsaker on “Humans, Machines, and the Dimensions of Microwork“. I’ll post the slides (and there will be a video on the conference site), but the sound bite is that you need to keep crowdsourcing tasks simple, manage the trade-off between task value and difficulty, and watch out for systematic bias.

I wrapped up the conference by hearing William Gunn talk about how Mendeley is disrupting bibliometrics and perhaps the entire academic publishing and reputation ecosystem. I laud his ambition and wish him and Mendeley luck in this quest.

 

In summary, three days of great talks, conversations, and general enjoyment. My thanks to Strata organizers Edd Dumbill and Alistair Croll for putting together such an outstanding event and for giving me the opportunity to participate.

 

Share This Article
Facebook Pinterest LinkedIn
Share

Follow us on Facebook

Latest News

protecting patient data
How to Protect Psychotherapy Data in a Digital Practice
Big Data Exclusive Security
data analytics
How Data Analytics Can Help You Construct A Financial Weather Map
Analytics Exclusive Infographic
AI use in payment methods
AI Shows How Payment Delays Disrupt Your Business
Artificial Intelligence Exclusive Infographic
financial analytics
Financial Analytics Shows The Hidden Cost Of Not Switching Systems
Analytics Exclusive Infographic

Stay Connected

1.2KFollowersLike
33.7KFollowersFollow
222FollowersPin

You Might also Like

Mobile App Development: How to Choose Between Native vs. Web vs. Hybrid

6 Min Read

Recommended read: Seizing the White Space

4 Min Read

Cash-hungry state agencies finding megamillions in revenue by drilling for money with Teradata

4 Min Read

Beating The Placebo Effect: Red Pill or Blue?

7 Min Read

SmartData Collective is one of the largest & trusted community covering technical content about Big Data, BI, Cloud, Analytics, Artificial Intelligence, IoT & more.

AI chatbots
AI Chatbots Can Help Retailers Convert Live Broadcast Viewers into Sales!
Chatbots
ai in ecommerce
Artificial Intelligence for eCommerce: A Closer Look
Artificial Intelligence

Quick Link

  • About
  • Contact
  • Privacy
Follow US
© 2008-25 SmartData Collective. All Rights Reserved.
Go to mobile version
Welcome Back!

Sign in to your account

Username or Email Address
Password

Lost your password?