The Fallacy of the Data Scientist Shortage

nraden
Last updated: 2012/04/01 at 7:00 AM

 

There is no question that the USA (in fact, most of the world) would be well-served with more quantitatively capable people to work in business and government. However, the current hysteria over the shortage of data scientists is overblown. To illustrate why, I am going to use an example from air travel.

 

On a recent trip from Santa Fe, NM to Phoenix, AZ, I tracked the various times:

| Step | Duration (min) | Cumulative (min) |
|---|---|---|
| Drive from Santa Fe to ABQ Airport | 65 | 65 |
| Park | 15 | 80 |
| Security | 25 | 105 |
| Wait to board | 20 | 125 |
| Boarding process | 30 | 155 |
| Taxiing | 15 | 170 |
| In flight | 60 | 230 |
| Taxiing | 12 | 242 |
| Deplane | 9 | 251 |
| Wait for valet bag | 7 | 258 |
| Travel to rental car | 21 | 279 |
| Arrive at destination in Tempe | 32 | 311 |

 

 

As you can see, the actual flying time of 60 minutes represents only 19% of the travel time (60/311). Because everything but the actual flight time is more or less constant for any domestic trip (disregarding common delays, connections and cancellations, which would skew this analysis even further), this low percentage of time in the air is a reality. For example, if the flight took two hours and fifteen minutes, it would still work out to only 135/386 = 35%. The most recent data I have, from 2005, shows the average nonstop distance flown per departure was 607 miles, so we can add about 25 minutes to the first calculation and arrive at 85/336 = 25%.
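The percentages above can be checked with a short script. This is only a minimal sketch: the durations are copied from the trip table, while the helper name `air_share`, the dictionary layout and the "(departure)"/"(arrival)" suffixes on the two taxiing rows are illustrative assumptions, not part of the original analysis.

```python
# Durations in minutes, copied from the trip table.
# The "(departure)" / "(arrival)" suffixes are added here only so that
# the two "Taxiing" rows have distinct dictionary keys.
TRIP_MINUTES = {
    "Drive from Santa Fe to ABQ Airport": 65,
    "Park": 15,
    "Security": 25,
    "Wait to board": 20,
    "Boarding process": 30,
    "Taxiing (departure)": 15,
    "In flight": 60,
    "Taxiing (arrival)": 12,
    "Deplane": 9,
    "Wait for valet bag": 7,
    "Travel to rental car": 21,
    "Arrive at destination in Tempe": 32,
}

# Everything except the flight itself is treated as fixed overhead.
GROUND_MINUTES = sum(TRIP_MINUTES.values()) - TRIP_MINUTES["In flight"]


def air_share(flight_minutes: int) -> float:
    """Fraction of door-to-door time spent in the air (hypothetical helper)."""
    return flight_minutes / (GROUND_MINUTES + flight_minutes)


# The three scenarios discussed in the text: the actual 60-minute flight,
# the 85-minute "average" flight, and a 135-minute flight.
for flight in (60, 85, 135):
    total = GROUND_MINUTES + flight
    print(f"{flight}-minute flight: {flight}/{total} = {air_share(flight):.0%}")
```

The fixed ground overhead of 251 minutes is what keeps the in-air share low even as the flight gets longer.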

 

Keep in mind, again, that these calculations do not account for late departures and arrivals, cancelled and re-booked flights, connections, flight attendants and pilots having nervous breakdowns, etc. It’s safe to say that on a typical domestic trip, at most about 25% of your travel time is spent in the air. Just for fun, let’s see how this would work out if we could take the (unfortunately retired) Concorde. Flying at roughly Mach 2, we would cut our travel time by 40 minutes, trimming our journey from five hours and eleven minutes to four hours and 31 minutes, about a 13% improvement.
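The Concorde estimate follows the same logic: speeding up only the flight segment can never save more than the flight's own share of the trip. A minimal sketch, assuming the 311-minute total and 60-minute flight from the trip table and the 40-minute saving quoted above; the helper `trip_improvement` is a hypothetical name introduced for illustration.

```python
TOTAL_MINUTES = 311   # door-to-door time from the trip table
FLIGHT_MINUTES = 60   # time actually spent in the air
MINUTES_SAVED = 40    # the saving estimated in the text for a supersonic flight


def trip_improvement(total_minutes: float, minutes_saved: float) -> float:
    """Fractional reduction in door-to-door time (hypothetical helper)."""
    return minutes_saved / total_minutes


new_total = TOTAL_MINUTES - MINUTES_SAVED
hours, minutes = divmod(new_total, 60)
print(f"New trip time: {hours} h {minutes} min")
print(f"Improvement: {trip_improvement(TOTAL_MINUTES, MINUTES_SAVED):.0%}")

# Upper bound: even an instantaneous flight saves only the flight's share.
best_case = trip_improvement(TOTAL_MINUTES, FLIGHT_MINUTES)
print(f"Best possible improvement from a faster flight: {best_case:.0%}")
```

Even in the limiting case of a zero-minute flight, the journey would shrink by only about a fifth, because the ground segments are untouched.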

 

What’s the point of all of this and what does it have to do with the so-called data scientist shortage?

 

Based on our research at Constellation Research, we find that analysts who work with Hadoop or other big data technologies spend a significant amount of their time on tasks that do NOT require any knowledge of advanced quantitative methods: configuring and maintaining clusters, writing programs to gather, move, cleanse and otherwise organize data for analysis, and many other common tasks in data analysis. In fact, even those who employ advanced quantitative techniques spend 50-80% of their time gathering, cleansing and preparing data. This percentage has not budged in decades. Keep in mind that advanced analytics is not a new phenomenon; what is new is the volume (to some extent) and variety of the source data, along with new techniques to deal with it, especially, but not limited to, Hadoop.

 

The interest in analytics has risen dramatically in the past two or three years; that is not in dispute. But the adoption of enterprise-scale analytics with big data is not guaranteed in most organizations beyond some isolated areas of expertise. Most of the activity is in predictable (commercial) industries: net-based businesses, financial services and telecommunications, for example. But these businesses have employed very large-scale analytics, at the bleeding edge of technology, for decades. For most organizations, analytics will be provided by algorithms embedded in applications not developed in-house, by third-party vendors of tools and services, and by consultants.

 

The good news is that 80% of the expertise you need for big data is readily available. The balance can be sourced and developed. The crème-de-la-crème of data scientists will fill roles in academia, technology vendors, Wall Street, research and government.

 

There are related and unrelated disciplines that are all combined under the term analytics. There is advanced analytics, descriptive analytics, predictive analytics and business analytics, all defined in a pretty murky way. It cries out for some precision. Here is how I characterize the many types of analytics by the quantitative techniques used and the level of skill of the practitioners who use these techniques.

 

 

| Type | Descriptive Title | Quantitative Sophistication/Numeracy | Sample Roles |
|---|---|---|---|
| Type I | Quantitative Research (True Data Scientist) | PhD or equivalent | Creation of theory, development of algorithms. Academic/research. Often employed in business or government for very specialized roles |
| Type II | (Current definition of) Data Scientist or Quantitative Analyst | Advanced Math/Stat, not necessarily PhD | Internal expert in statistical and mathematical modeling and development, with solid business domain knowledge |
| Type III | Operational Analytics | Good business domain, background in statistics optional | Running and managing analytical models. Strong skills in and/or project management of analytical systems implementation |
| Type IV | Business Intelligence/Discovery | Data and numbers oriented, but no special advanced statistical skills | Reporting, dashboard, OLAP and visualization use, possibly design; performing posterior analysis of results driven by quantitative methods |

 

“Data Scientist” is a relatively new title for quantitatively adept people with accompanying business skills. The ability to formulate and apply tools for classification, prediction and even optimization, coupled with a fairly deep understanding of the business itself, is clearly in the realm of Type II efforts. However, it seems likely that most so-called data scientists will lean more towards quantitative and data-oriented subjects than towards business planning and strategy. The reason is that the term data scientist emerged from businesses like Google or Facebook, where the data is the business, so understanding the data is equivalent to understanding the business. This is clearly not the case for most organizations. We see very few Type II data scientists with the kind of in-depth knowledge of the whole business that, say, actuaries have in the insurance business; their extensive training should be a model for the newly designated data scientists (see my blogs “Who Needs Analytics PhD’s? Grow Your Own” and “What is a Data Scientist and What Isn’t”).

 

 

 

 

 

 
