Defining "Data Scientist", cont'd
Forbes continues the series exploring the definition of "data scientist" with an in-depth interview with bitly's Hilary Mason, who describes "data science" thus:
“I think of ‘data science’ as a flag that was planted at the intersection of several different disciplines that have not always existed in the same place,” Mason says. “Statistics, computer science, domain expertise, and what I usually call ‘hacking,’ though I don’t mean the ‘evil’ kind of hacking. I mean the ability to take all those statistics and computer science, mash them together and actually make something work.”
This is a theme that Hilary explored in detail in an excellent talk at Dropbox HQ in San Francisco (which I was lucky enough to catch in person).
In the Forbes interview, Hilary illustrates a key data science principle, incorporating data from distinct sources into an analysis, with the example of Netflix's movie-recommendation engine:
“Netflix recommendations are good, but not great,” she says. “Netflix only knows about the universe of things you have watched on Netflix. So if Netflix algorithms could know about everything you see in your life—all the media you’ve seen, all the books you’ve read, all the articles you read, the music you listen to—the recommendations would be much better. But the data that that algorithm has explored is just a tiny component of the whole problem, and I think that that’s true for most of the problems we try and solve with data, particularly as they relate to business. The machine might explore only one dimension, and so it’s really important to have a human contextualize it and understand what it really means.”
There are many more great insights into data science in the full interview, which you can read at the link below.
Other Posts by David Smith