SmartData Collective

When Big Data Can’t Predict

Bill Franks

Most people assume that in the age of big data we always have more than enough information to build robust analytics. Unfortunately, that isn’t always the case. There are situations where massive amounts of data still don’t allow even basic predictions to be made with confidence. In many of those cases, there isn’t much to do other than recognize the facts and stick to the basics instead of getting fancy. Big data that can’t be used to predict seems like a paradox at first, but let’s explore why it isn’t.

Scenario 1: Big Data, Small Universe

One situation where issues arise is when we have a ton of data on a very small population, which makes it tough to find meaningful patterns. Let’s think about an aircraft manufacturer. Today’s airplanes generate terabytes of data every hour of operation. Analyzing that data has a lot of benefits, such as understanding how the engines operate under differing conditions. At the same time, some exciting analytics like predictive maintenance can be difficult. Why is that?

Realize that even the biggest aircraft manufacturers only put out a few hundred airplanes per year. Once the different models are taken into account, perhaps only a couple dozen of some models are produced in any given year. Even if the aircraft come fully loaded with sensors, it will be hard to develop meaningful predictive part failure models. Why? Because with only a few dozen or a few hundred aircraft, the sample is too small.

This is exacerbated by the low failure rate of things like an engine (or engine component), especially on a new aircraft. So, while petabytes of data might be collected over a couple of years of operation, there simply may not be enough aircraft to create a large enough pool of good and bad events from which to build predictive models that really work. Certainly, we can monitor the data for anomalous patterns that might support an investigation or intervention. But that’s not a predictive model.
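To make the small-universe problem concrete, here is a minimal sketch of how few failure events a small fleet might produce. The fleet size, time span, and annual failure rate below are hypothetical numbers chosen for illustration, not figures from any manufacturer, and the Poisson approximation is an assumption that failures are rare and independent.

```python
import math


def expected_failures(fleet_size: int, years: float, annual_failure_rate: float) -> float:
    """Expected number of failure events across the whole fleet."""
    return fleet_size * years * annual_failure_rate


def prob_at_most(k: int, mean: float) -> float:
    """Poisson probability of observing at most k events for a given mean."""
    return sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k + 1))


# Hypothetical scenario: 40 aircraft of one model, 2 years of operation,
# and a 2% chance per aircraft-year that the component of interest fails.
mean_events = expected_failures(fleet_size=40, years=2, annual_failure_rate=0.02)

print(f"Expected failure events: {mean_events:.1f}")
print(f"Chance of seeing no failures at all: {prob_at_most(0, mean_events):.1%}")
print(f"Chance of seeing fewer than 10 failures: {prob_at_most(9, mean_events):.4%}")
```

Under these assumed numbers, the sensors may log petabytes, yet the number of actual failures to learn from stays in the single digits. The volume of sensor readings does not change the count of labeled bad events.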

Scenario 2: Big Data, Big Universe, Incredibly Rare Events

There are other situations where there is a large universe of people or things to analyze and lots of data about them all. However, when events are exceedingly rare, there still may not be enough exceptions to build truly effective predictive models. Again, there can be a lot of value in analyzing the data and understanding various aspects of the behavior of the people or things. The point is simply that it may not be possible to build effective predictive models.

Let’s consider computer chips. Many millions, if not billions, of chips are produced each year and the rate is ever increasing. Decades ago, defects on the order of one in 10,000 or one in 100,000 might have been acceptable. With today’s chip-infused products, defects need to be closer to the one in millions level. I’ve had clients mention that there is pressure from the auto industry to drive chip defect rates down to one in a billion or less. Why is that?

The answer is that if any given new car has 1,000 chips in it in a few years, even small defect rates start to translate into a lot of defective vehicles. With a defect rate of one in 1,000,000, about one of every thousand cars produced would have at least one defective chip. That translates to a lot of cost. It can also lead to lost lives if a chip fails in an autonomous vehicle and causes it to malfunction while in operation. Hence, the push for incredibly low defect rates.
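The arithmetic behind the one-in-a-thousand figure can be sketched as follows, assuming chip defects are independent of one another (an assumption made here for illustration). The chance that a car has at least one defective chip is then 1 - (1 - p)^n for n chips with per-chip defect rate p. The function name is hypothetical.

```python
import math


def prob_car_has_defect(chips_per_car: int, chip_defect_rate: float) -> float:
    """Probability that at least one chip in a car is defective.

    Computes 1 - (1 - p)^n using log1p/expm1, which stays accurate
    when p is extremely small. Assumes defects are independent.
    """
    return -math.expm1(chips_per_car * math.log1p(-chip_defect_rate))


# Defect rates mentioned in the article, applied to a car with 1,000 chips.
for rate in (1e-5, 1e-6, 1e-9):
    p_car = prob_car_has_defect(1_000, rate)
    print(f"chip defect rate {rate:.0e}: roughly 1 in {1 / p_car:,.0f} cars affected")
```

For very small rates the result is close to n times p, which is why a one-in-a-million chip defect rate lands near one in a thousand cars.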

The issue is that if such low defect rates are achieved, and if we assume that a wide range of issues could lead to a defective chip, there will be very few instances of any given defect happening for any specific set of reasons. We may never have enough of a sample to build a good model that predicts when and where those failures might occur. Considering that chips become outdated and are replaced with newer models within just a few years, it is quite plausible that this will be an ongoing issue.
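A rough sketch of why the sample dries up: divide the expected number of defective chips across the distinct root causes. The production volume and the number of causes below are hypothetical, and spreading defects evenly across causes is a simplifying assumption.

```python
def expected_examples_per_cause(chips_produced: int,
                                defect_rate: float,
                                distinct_causes: int) -> float:
    """Expected defective chips per root cause, assuming an even spread."""
    return chips_produced * defect_rate / distinct_causes


# Hypothetical: one billion chips of a given design, 20 distinct root causes.
for rate in (1e-6, 1e-9):
    per_cause = expected_examples_per_cause(1_000_000_000, rate, 20)
    print(f"defect rate {rate:.0e}: about {per_cause:g} examples per cause")
```

At one in a million there are a few dozen examples per cause under these assumptions. At one in a billion the expected count per cause drops below one, which leaves nothing to train on before the chip design is replaced.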

Don’t Despair, Prepare

Keep in mind that the issues I’ve raised here are not the rule, but the exception. However, as data is collected from more and more sources and we analyze more and more aspects of our businesses, these exceptions are almost certain to pop up within your organization now and then. The important thing to do is simply to be on the lookout for cases where you have a very small universe to analyze, an incredibly rare event to analyze, or, worst of all, a rare event within a small universe. I am assuming, naturally, that you are only considering situations where the data is relevant to your business problem. Data that isn’t relevant will never add value no matter how big or small.

When occasions arise where you’re uncertain your data is going to be effective for prediction, make sure you assess what will plausibly be possible before investing too much energy into developing sophisticated analytics on the data. You may have to settle for basic analytics in some cases. It is important to keep in mind, however, that you should still be better off than if you had no data at all to analyze. That’s the upside to keep in mind instead of letting frustration get the best of you.


Bill Franks is Chief Analytics Officer for The International Institute For Analytics (IIA). Franks is also the author of Taming The Big Data Tidal Wave and The Analytics Revolution. His work has spanned clients in a variety of industries for companies ranging in size from Fortune 100 companies to small non-profit organizations. You can learn more at http://www.bill-franks.com.
