© 2008-25 SmartData Collective. All Rights Reserved.

The Trouble with Big Data

gilpress

Douglas Merrill, former CIO/VP of Engineering at Google, has issued an important warning about big data:

“With too little data, you won’t be able to make any conclusions that you trust.  With loads of data you will find relationships that aren’t real… On net, having a degree in math, economics, AI, etc., isn’t enough. Tool expertise isn’t enough.  You need experience in solving real world problems, because there are a lot of important limitations to the statistics that you learned in school.  Big data isn’t about bits, it’s about talent.”  

What Merrill is warning about and what he means by “talent,” I think, is the danger of blindly falling in love with correlations and not being able to develop a model that explains (or predicts) the relationships found. This is what I think David Smith means when he agrees with Merrill, saying that “this is a great illustration of why the data science process is a valuable one for extracting information from Big Data, because it combines tool expertise with statistical expertise and the domain expertise required to understand the problem and the data applicable to it.”

The “talent” of “understanding the problem and the data applicable to it” is what makes a good scientist: the required skepticism, the development of hypotheses (models), and the unending quest to refute them, following the scientific method that has brought us remarkable progress over the last three hundred and fifty years.

But this tradition is threatened by the excitement (too much excitement?) around big data. According to nextgov.com, Farnam Jahanian, chief of the National Science Foundation’s Computer and Information Science and Engineering Directorate, believes that “Big data has the power to change scientific research from a hypothesis-driven field to one that’s data-driven.” Explains Bill Perlowitz, chief technology officer of Wyle science: “In hypothetical science, you propose a hypothesis, you go out and gather data and you see if your hypothesis is supported. That limits your exploration to what you can imagine. It also limits the number of relationships you can explore because the human mind can only go so far. The shift with data-driven science and big data is that first we collect the data and then we see what it tells us. We don’t have a pretense that we understand what those relationships are, or what information we may find.”

Sure, just take for example that fella Einstein, who had the “pretense” to speculate about the universe without having any data, big or small, to support his limited imagination…

At least one commentator on the nextgov post is not buying the next-big-thing-in-science, observing that “With Big Data, any clever analyst can find the data set with the right, spurious correlation to prove his bias. How not to fall for the ‘revelation-de-jour’ will be the Big Data challenge.”
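
The commenter's point about spurious correlations can be illustrated with a toy simulation. The sketch below is not drawn from any of the sources quoted here; the function names, the sample size of 20 observations, and the candidate counts are all assumptions chosen for illustration. It generates a random "target" series and then searches through many equally random, unrelated series for the one that correlates with it most strongly.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def strongest_spurious_correlation(n_candidates, n_obs=20, seed=0):
    """Return the largest |r| between a random target and pure-noise candidates.

    Every series is independent Gaussian noise, so any correlation
    that turns up is spurious by construction (illustrative setup).
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_obs)]
    best = 0.0
    for _ in range(n_candidates):
        candidate = [rng.gauss(0, 1) for _ in range(n_obs)]
        best = max(best, abs(pearson(target, candidate)))
    return best


if __name__ == "__main__":
    for n in (10, 100, 5000):
        r = strongest_spurious_correlation(n)
        print(f"{n:>5} unrelated series searched: strongest |r| = {r:.2f}")
```

Because nothing in the data is actually related, the strongest correlation found can only grow as more series are searched. That is the sense in which a clever analyst with enough data sets can always find one that appears to support a prior belief.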

I’m not sure how much this misguided excitement around big data is a clear and present danger to science right now. But the threat to sound business decisions is quite evident and some scholars are fighting back. Technology Review reports that Daniel Gayo-Avello, at the University of Oviedo in Spain, “knocks Twitter’s predictive crown off altogether.” After reviewing the work of researchers who claim that Twitter’s data can predict election results, Gayo-Avello concluded that it is flawed because of the simple fact that “social media is not a representative and unbiased sample of the voting population.”
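
Gayo-Avello's objection can be made concrete with a small simulation. This is only a sketch under invented assumptions: a hypothetical electorate in which 20% of voters use a social platform, platform users favor candidate A at 65%, and everyone else favors A at 45%. None of these figures come from his paper or from Technology Review. The code compares an estimate built from every platform user with an estimate built from a much smaller random sample of the whole electorate.

```python
import random


def simulate_poll(pop_size=200_000, random_sample_size=2_000, seed=1):
    """Compare a large but biased sample with a small random one.

    All proportions below are illustrative assumptions, not real data.
    """
    rng = random.Random(seed)

    # Each voter is a pair: (uses the platform?, votes for candidate A?)
    population = []
    for _ in range(pop_size):
        on_platform = rng.random() < 0.20
        p_support = 0.65 if on_platform else 0.45
        population.append((on_platform, rng.random() < p_support))

    true_share = sum(vote for _, vote in population) / pop_size

    # "Big data" estimate: every platform user, and nobody else.
    platform_votes = [vote for on, vote in population if on]
    platform_estimate = sum(platform_votes) / len(platform_votes)

    # Small estimate: a simple random sample of the whole electorate.
    sampled = rng.sample(population, random_sample_size)
    random_estimate = sum(vote for _, vote in sampled) / random_sample_size

    return {
        "true_share": true_share,
        "platform_n": len(platform_votes),
        "platform_estimate": platform_estimate,
        "random_n": random_sample_size,
        "random_estimate": random_estimate,
    }


if __name__ == "__main__":
    for name, value in simulate_poll().items():
        print(f"{name}: {value}")
```

In this setup the platform sample is many times larger than the random one, yet its estimate sits much further from the true share, because adding more platform users does nothing to correct for who is missing from the platform.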

Gayo-Avello is joined, also on the pages of Technology Review, by Wharton’s Peter Fader, who responds unequivocally to a question about businesses that “promise to take a Twitter stream or a collection of Facebook comments and then make some prediction”:

“That is all ridiculous. If you can get me a really granular view of data—for example, an individual’s tweets and then that same individual’s transactions, so I can see how they are interacting with each other—that’s a whole other story. But that isn’t what is happening. People are focusing on sexy social-media stuff and pushing it much further than they should be. The important part, as both scientists and businesspeople, is to understand what our limits are and to use the best possible science to fill in the gaps. All the data in the world will never achieve that goal for us.”

Amazing. Apparently Gayo-Avello and Fader never heard that more data beats sampling (“the big data blasphemy” per Meta S. Brown) and that science has finally thrown off the shackles of hypothesis-making.

While a lot of money will continue to drive blind exploration in both science and business, I’m certain that the advancements and triumphs of the future will come from cool minds developing imaginative models and theories and testing them with the help of new big data tools and technologies. The trouble with big data may be just the hype surrounding it.  
