The Data Scientist Team

August 30, 2013

I’ve been intrigued by all of the attention that the world of Data Science has received. It seems that every popular business magazine has published several articles on it, and it’s become a mainstream topic at most industry conferences. One thing that strikes me as odd is that a group of folks actually believes that all of the activities necessary to deliver new business discoveries with data science can reasonably be addressed by finding individuals who have a cornucopia of technical and business skills. One popular belief is that a single Data Scientist should be able to handle all of the business and technical activities necessary to identify, qualify, prove, and explain a business idea with detailed data.

If you can find individuals who comprehend the peculiarities of source data extraction, have mastered data integration techniques, understand parallel algorithms to process tens of billions of records, have worked with specialized data preparation tools, and can debate your company’s business strategy and priorities, cool! Hire these folks and chain their legs to their desks as soon as possible.

If you can’t, you might consider building a team that can cover the various roles that are necessary to support a Data Science initiative. There’s a lot more to Data Science than simply processing a pile of data with the latest open source framework.  The roles that you should consider include:

Data Services

Manages the various data repositories that feed data to the analytics effort.  This includes understanding the schemas, tracking the data content, and making sure the platforms are maintained. Companies with existing data warehouses, data marts, or reporting systems typically have a group of folks focused on these activities (DBAs, administrators, etc.).

Data Engineer

Responsible for developing and implementing tools to gather, move, process, and manage data. In most analytics environments, these activities are handled by the data integration team.  In the world of Big Data or Data Science, this isn’t just ETL development for batch files; it also includes processing data streams and handling the cleansing and standardization of numerous structured and unstructured data sources.

Data Manager

Handles the traditional data management or source data stewardship role; the focus is on supporting developers’ access to and manipulation of data content. This includes tracking the available data sources (internal and external), understanding the location and underlying details of specific attributes, and supporting developers’ code construction efforts.

Production Development

Responsible for packaging the Data Scientist’s discoveries into a production-ready deliverable. This may include one or more components: new data attributes, new algorithms, a new data processing method, or an entirely new end-user tool. The goal is to ensure that the discoveries deliver business value.

Data Scientist

The team leader and the individual who excels at analyzing data to help a business gain a competitive edge. They are adept at technical activities and equally qualified to lead a business discussion about the benefits of a new business strategy or approach. They can tackle all aspects of a problem and often lead the interdisciplinary team to construct an analytics solution.

There’s no shortage of success stories about the amazing data discoveries uncovered by Data Scientists.  In many of those companies, the Data Scientist didn’t have an incumbent data warehousing or analytics environment; they couldn’t pick up the phone to call a data architect, there wasn’t any metadata documentation, and their company didn’t have a standard set of data management tools.  They were on their own.  So, the Data Scientist became “chief cook and bottle washer” for everything that is big data and analytics.

Most companies today have institutionalized data analysis; there are multiple data warehouses, lots of dashboards, and even a query support desk.  And while there’s a big difference between desktop reporting and processing social media feedback, much of the “behind the scenes” data management and data integration work is the same.  If your company already has an incumbent data and analytics environment, it makes sense to leverage existing methods, practices, and staff skills.  Let the Data Scientists focus on identifying the next big idea and the heavy analytics; let the rest of the team deal with all of the other work.
