The Fallacy of the Data Scientist Shortage

nraden
Last updated: 2012/04/01 at 7:00 AM

 

There is no question that the USA (in fact, most of the world) would be well-served with more quantitatively capable people to work in business and government. However, the current hysteria over the shortage of data scientists is overblown. To illustrate why, I am going to use an example from air travel.

 

On a recent trip from Santa Fe, NM to Phoenix, AZ, I tracked the various times:


| Step | Duration (min) | Cumulative (min) |
|---|---|---|
| Drive from Santa Fe to ABQ Airport | 65 | 65 |
| Park | 15 | 80 |
| Security | 25 | 105 |
| Wait to board | 20 | 125 |
| Boarding process | 30 | 155 |
| Taxiing | 15 | 170 |
| In flight | 60 | 230 |
| Taxiing | 12 | 242 |
| Deplane | 9 | 251 |
| Wait for valet bag | 7 | 258 |
| Travel to rental car | 21 | 279 |
| Arrive at destination in Tempe | 32 | 311 |

 

 

As you can see, the actual flying time of 60 minutes represents only 19% of the total travel time (60/311). Because everything but the flight itself is more or less constant for any domestic trip (disregarding common delays, connections and cancellations, which would skew this analysis even further), this low percentage of time in the air is a reality. For example, if the flight took two hours and fifteen minutes, it would still work out to only 135/386 = 35%. The most recent data I have, from 2005, shows the average nonstop distance flown per departure was 607 miles, so we can add about 25 minutes to the first calculation and arrive at 85/336 = 25%.
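For readers who want to check the arithmetic, here is a minimal Python sketch of the same calculation. The segment durations come straight from the trip log; the names `segments`, `ground_minutes` and `flight_share` are just illustrative labels, and the sketch assumes, as the text does, that everything except the in-flight segment stays constant.

```python
# Segment durations in minutes, copied from the trip log.
segments = [
    ("Drive from Santa Fe to ABQ Airport", 65),
    ("Park", 15),
    ("Security", 25),
    ("Wait to board", 20),
    ("Boarding process", 30),
    ("Taxiing", 15),
    ("In flight", 60),
    ("Taxiing", 12),
    ("Deplane", 9),
    ("Wait for valet bag", 7),
    ("Travel to rental car", 21),
    ("Arrive at destination in Tempe", 32),
]

total_minutes = sum(minutes for _, minutes in segments)

# Everything that is not time in the air is treated as a fixed overhead.
ground_minutes = total_minutes - 60


def flight_share(flight_minutes, overhead_minutes):
    """Fraction of door-to-door time spent in the air."""
    return flight_minutes / (flight_minutes + overhead_minutes)


# The three cases discussed in the text: the actual 60-minute flight,
# the average-length flight (60 + 25 minutes), and a 2 h 15 min flight.
for label, flight in [("Actual flight", 60),
                      ("Average nonstop flight", 85),
                      ("2 h 15 min flight", 135)]:
    share = flight_share(flight, ground_minutes)
    print(f"{label}: {flight}/{flight + ground_minutes} = {share:.0%}")
```

The fixed 251 minutes on the ground dominates in every case, which is why even more than doubling the flight time lifts the in-air share only from 19% to 35%.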

 

Keep in mind, again, that these calculations do not account for late departures and arrivals, cancelled and re-booked flights, connections, flight attendants and pilots having nervous breakdowns, etc. It’s safe to say that at most 25% of your travel time is spent in the air. Just for fun, let’s see how this would work out if we could take the (unfortunately retired) Concorde. Flying at Mach 2.5 would cut our travel time by 40 minutes, trimming our journey from five hours and eleven minutes to four hours and 31 minutes, about a 13% improvement.
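The Concorde figures can be checked the same way. This is only a sketch: the 40 minutes saved is the estimate given in the text, not a computed value, and the variable names are illustrative.

```python
total_minutes = 311   # door-to-door time from the trip log
flight_minutes = 60   # time actually spent in the air
minutes_saved = 40    # estimated saving from a supersonic flight (taken from the text)

new_total = total_minutes - minutes_saved
hours, minutes = divmod(new_total, 60)

# Improvement on the flight leg alone versus on the whole journey.
leg_improvement = minutes_saved / flight_minutes
overall_improvement = minutes_saved / total_minutes

print(f"New door-to-door time: {hours} h {minutes} min")
print(f"Flight leg shortened by {leg_improvement:.0%}")
print(f"Whole journey shortened by {overall_improvement:.0%}")
```

Cutting the flight leg by two thirds shortens the whole trip by only about 13%, because the leg being sped up is less than a fifth of the total.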

 

What’s the point of all of this and what does it have to do with the so-called data scientist shortage?

 

Based on our research at Constellation Research, we find that analysts who work with Hadoop or other big data technologies spend a significant amount of their time on tasks that do NOT require any knowledge of advanced quantitative methods: configuring and maintaining clusters; writing programs to gather, move, cleanse and otherwise organize data for analysis; and many other common tasks in data analysis. In fact, even those who employ advanced quantitative techniques spend 50-80% of their time gathering, cleansing and preparing data. This percentage has not budged in decades. Keep in mind that advanced analytics is not a new phenomenon; what is new is the volume (to some extent) and variety of the source data, along with new techniques to deal with it, especially, but not limited to, Hadoop.

 

Interest in analytics has risen dramatically in the past two or three years; that is not in dispute. But the adoption of enterprise-scale analytics with big data is not guaranteed in most organizations beyond some isolated areas of expertise. Most of the activity is in predictable (commercial) industries such as net-based businesses, financial services and telecommunications, but these businesses have employed very large-scale analytics at the bleeding edge of technology for decades. For most organizations, analytics will be provided by algorithms embedded in applications not developed in-house, by third-party vendors of tools and services, and by consultants.

 

The good news is that 80% of the expertise you need for big data is readily available. The balance can be sourced and developed. The crème-de-la-crème of data scientists will fill roles in academia, at technology vendors, on Wall Street, and in research and government.

 

There are related and unrelated disciplines that are all combined under the term analytics. There is advanced analytics, descriptive analytics, predictive analytics and business analytics, all defined in a pretty murky way. It cries out for some precision. Here is how I characterize the many types of analytics by the quantitative techniques used and the level of skill of the practitioners who use these techniques.

 

 

| Type | Descriptive Title | Quantitative Sophistication/Numeracy | Sample Roles |
|---|---|---|---|
| Type I | Quantitative Research (True Data Scientist) | PhD or equivalent | Creation of theory, development of algorithms. Academic/research. Often employed in business or government for very specialized roles. |
| Type II | (Current definition of) Data Scientist or Quantitative Analyst | Advanced Math/Stat, not necessarily PhD | Internal expert in statistical and mathematical modeling and development, with solid business domain knowledge. |
| Type III | Operational Analytics | Good business domain, background in statistics optional | Running and managing analytical models. Strong skills in and/or project management of analytical systems implementation. |
| Type IV | Business Intelligence/Discovery | Data and numbers oriented, but no special advanced statistical skills | Reporting, dashboard, OLAP and visualization use, possibly design. Performing posterior analysis of results driven by quantitative methods. |

 

“Data Scientist” is a relatively new title for quantitatively adept people with accompanying business skills. The ability to formulate and apply tools to classification, prediction and even optimization, coupled with a fairly deep understanding of the business itself, is clearly in the realm of Type II efforts. However, it seems pretty likely that most so-called data scientists will lean more towards quantitative and data-oriented subjects than towards business planning and strategy. The reason is that the term data scientist emerged from businesses like Google or Facebook, where the data is the business, so understanding the data is equivalent to understanding the business. This is clearly not the case for most organizations. We see very few Type II data scientists with the kind of in-depth knowledge of the whole business that, say, actuaries have in the insurance business; their extensive training should be a model for the newly designated data scientists (see my blogs “Who Needs Analytics PhD’s? Grow Your Own” and “What is a Data Scientist and What Isn’t.”)

 

 

 

 

 

 
