SmartData Collective
© 2008-25 SmartData Collective. All Rights Reserved.

The Trouble with Big Data

gilpress

Douglas Merrill, former CIO/VP of Engineering at Google, has issued an important warning about big data:

“With too little data, you won’t be able to make any conclusions that you trust.  With loads of data you will find relationships that aren’t real… On net, having a degree in math, economics, AI, etc., isn’t enough. Tool expertise isn’t enough.  You need experience in solving real world problems, because there are a lot of important limitations to the statistics that you learned in school.  Big data isn’t about bits, it’s about talent.”  

What Merrill is warning about, I think, is the danger of blindly falling in love with correlations without being able to develop a model that explains (or predicts) the relationships found; the “talent” he means is the ability to build such models. This is what I think David Smith means when he agrees with Merrill, saying that “this is a great illustration of why the data science process is a valuable one for extracting information from Big Data, because it combines tool expertise with statistical expertise and the domain expertise required to understand the problem and the data applicable to it.”

The “talent” of “understanding the problem and the data applicable to it” is what makes a good scientist: the required skepticism, the development of hypotheses (models), and the unending quest to refute them, following the scientific method that has brought us remarkable progress over the last three hundred and fifty years.

But this tradition is threatened by the excitement (too much excitement?) around big data. According to nextgov.com, Farnam Jahanian, chief of the National Science Foundation’s Computer and Information Science and Engineering Directorate, believes that “Big data has the power to change scientific research from a hypothesis-driven field to one that’s data-driven.” Explains Bill Perlowitz, chief technology officer of Wyle science: “In hypothetical science, you propose a hypothesis, you go out and gather data and you see if your hypothesis is supported. That limits your exploration to what you can imagine. It also limits the number of relationships you can explore because the human mind can only go so far. The shift with data-driven science and big data is that first we collect the data and then we see what it tells us. We don’t have a pretense that we understand what those relationships are, or what information we may find.”

Sure, just take for example that fella Einstein, who had the “pretense” to speculate about the universe without having any data, big or small, to support his limited imagination…

At least one commenter on the nextgov post is not buying the next-big-thing-in-science, observing that “With Big Data, any clever analyst can find the data set with the right, spurious correlation to prove his bias. How not to fall for the ‘revelation-de-jour’ will be the Big Data challenge.”
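
The commenter's point about spurious correlations can be illustrated with a small simulation. The sketch below is not taken from any of the sources quoted here; it assumes a target series of 30 observations and a pile of candidate series, all pure random noise, and reports the strongest correlation a search through the pile turns up. The function names and the sizes used are made up for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def strongest_chance_correlation(n_obs, n_candidates, seed=0):
    """Largest |correlation| between a random target and random candidates.

    Every series is independent Gaussian noise, so any correlation
    found here is spurious by construction.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_obs)]
    best = 0.0
    for _ in range(n_candidates):
        candidate = [rng.gauss(0, 1) for _ in range(n_obs)]
        best = max(best, abs(pearson(target, candidate)))
    return best


if __name__ == "__main__":
    # The more unrelated series we search, the better the "best" match looks.
    for k in (10, 100, 1000):
        print(k, round(strongest_chance_correlation(30, k), 2))
```

Because the seed is fixed, the smaller searches are prefixes of the larger ones, so the best match can only hold steady or improve as more noise is searched. Nothing about the data changed; only the number of places the analyst looked.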

I’m not sure how much this misguided excitement around big data is a clear and present danger to science right now. But the threat to sound business decisions is quite evident and some scholars are fighting back. Technology Review reports that Daniel Gayo-Avello, at the University of Oviedo in Spain, “knocks Twitter’s predictive crown off altogether.” After reviewing the work of researchers who claim that Twitter’s data can predict election results, Gayo-Avello concluded that it is flawed because of the simple fact that “social media is not a representative and unbiased sample of the voting population.”
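
Gayo-Avello's objection can be sketched in the same way. The simulation below is an illustration only, with every rate invented: 30% of voters are young, young voters favor candidate A at 65% against 40% for everyone else, and young voters are far more likely to post on a hypothetical platform. It compares a large self-selected platform sample with a small random poll.

```python
import random


def simulate(seed=0, population_size=200_000, poll_size=1_000):
    """Compare a large self-selected sample with a small random poll."""
    rng = random.Random(seed)

    # Each voter is (is_young, supports_a). All rates are made up.
    voters = []
    for _ in range(population_size):
        is_young = rng.random() < 0.30
        supports_a = rng.random() < (0.65 if is_young else 0.40)
        voters.append((is_young, supports_a))

    # Platform users self-select: young voters post far more often.
    platform = [v for v in voters if rng.random() < (0.40 if v[0] else 0.05)]

    # A classic poll: small, but every voter is equally likely to be picked.
    poll = rng.sample(voters, poll_size)

    def share(group):
        return sum(1 for _, supports_a in group if supports_a) / len(group)

    return {
        "true_share": share(voters),
        "platform_share": share(platform),
        "platform_size": len(platform),
        "poll_share": share(poll),
        "poll_size": len(poll),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(name, round(value, 3))
```

Under these assumptions the platform sample is dozens of times larger than the poll, yet it calls the election for the wrong candidate, while the small random poll lands close to the true share. More data does not repair a sample that was never representative.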

Gayo-Avello is joined, also on the pages of Technology Review, by Wharton’s Peter Fader, who responds unequivocally to a question about businesses that “promise to take a Twitter stream or a collection of Facebook comments and then make some prediction”:

“That is all ridiculous. If you can get me a really granular view of data—for example, an individual’s tweets and then that same individual’s transactions, so I can see how they are interacting with each other—that’s a whole other story. But that isn’t what is happening. People are focusing on sexy social-media stuff and pushing it much further than they should be. The important part, as both scientists and businesspeople, is to understand what our limits are and to use the best possible science to fill in the gaps. All the data in the world will never achieve that goal for us.”

Amazing. Apparently Gayo-Avello and Fader have never heard that more data beats sampling (“the big data blasphemy” per Meta S. Brown) and that science has finally thrown off the shackles of hypothesis-making.

While a lot of money will continue to drive blind exploration in both science and business, I’m certain that the advancements and triumphs of the future will come from cool minds developing imaginative models and theories and testing them with the help of new big data tools and technologies. The trouble with big data may be just the hype surrounding it.  
