SmartData Collective
Keeping Your Big Data Analysis Clean

Rick Delgado

‘Outlier’ is a term that comes from statistics and data analytics. Math.com defines an outlier as “a value that lies outside (is much smaller or larger than) most of the other values in a set of data,” and gives a sample set as an example: given the values 25, 29, 3, 32, 85, 33, 27, and 28, both 3 and 85 are outliers.
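
To make the definition concrete, here is a minimal sketch that flags the outliers in the sample values above. It assumes one common rule of thumb, Tukey's 1.5 × IQR fences, which the Math.com definition does not itself prescribe; the function name `iqr_outliers` and the parameter `k` are illustrative, not part of any standard API.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR beyond the first or third quartile."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# The sample values quoted in the article.
sample = [25, 29, 3, 32, 85, 33, 27, 28]
flagged = iqr_outliers(sample)
print(flagged)  # [3, 85]
```

Other thresholds, such as z-scores or domain-specific limits, may suit a given data set better; the point is only that "much smaller or larger than most values" can be turned into an explicit, testable rule.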

If you’re looking to become an outlier, or if you’re using a big data platform to optimize your entire business model and looking for outliers to either weed out or leverage, then it’s important to understand where outliers come from and how instructive or beneficial they are to your particular data set. Above all else, you must learn to recognize whether the outliers that crop up in your data analysis are the result of flaws in your analytics model or anomalies particular to your business, and whether they should be eliminated or enhanced. That level of understanding begins and ends with keeping your big data analysis clean. Here are a few tips for doing so.

  1. Investigate and identify the cause: Not all outliers are the result of errors. They may be exactly what you’re looking for. Sometimes, however, outliers come from a transcription error or from malfunctioning equipment that reports inaccurate values. Extreme outliers like these can reduce the accuracy of your analysis, so you’ll want to either remove those values from the data set or fix the flaws causing them.

  2. Use data visualization tools: Data visualization tools make it much easier to spot trends and patterns in a large data set than looking at the raw numbers. Seeing anomalies is the first step to understanding them.

  3. Know the factors that may skew your data: In a bar full of average people plus Bill Gates, a measurement of the average income in the room would be skewed by the presence of Gates. French census data taken the year Napoleon Bonaparte was born would show nothing out of the ordinary, and yet how much did he affect European census data throughout the nineteenth century? These examples show how heavily a single outlier can influence average values.

  4. Be agnostic about your outliers: In and of themselves, outliers are neither good nor bad. They are simply extreme values that may or may not be expected. Most of all, outliers are instructive. They represent risks, opportunities, mistakes, anomalies, or something else. Their usefulness depends on context and on how that context relates to a company’s goals.

  5. Check your assumptions at the door: Assumptions about your data will mislead you and create biases that affect the outcome of your data analysis. It’s very common for people to overlook their underlying assumptions and biases. Try to keep an open mind about what the data tells you, and look for alternate interpretations where possible. Sometimes the idea that data analysis will reveal a problem and point toward an eventual solution is itself an assumption.
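
The Bill Gates example from the list above can be sketched in a few lines. The income figures here are invented purely for illustration (they are not real data about anyone), and the comparison with the median is an addition to the article's point about averages: it shows one way to see how much a single extreme value is driving a summary statistic.

```python
import statistics

# Hypothetical annual incomes for nine ordinary patrons (illustrative only).
patrons = [38_000, 42_000, 45_000, 47_000, 50_000, 52_000, 55_000, 58_000, 63_000]

# A made-up income for one extremely wealthy guest who walks in.
wealthy_guest = 1_000_000_000
with_guest = patrons + [wealthy_guest]

mean_before = statistics.mean(patrons)
mean_after = statistics.mean(with_guest)
median_before = statistics.median(patrons)
median_after = statistics.median(with_guest)

print(f"mean:   {mean_before:,.0f} -> {mean_after:,.0f}")
print(f"median: {median_before:,.0f} -> {median_after:,.0f}")
```

In this sketch the mean jumps from 50,000 to roughly 100 million, while the median barely moves. A large gap between the two is a quick hint that an outlier is skewing the average and deserves a closer look.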

Ironically, most businesses, if not all, are applying big data platforms toward the ultimate goal of becoming outliers in their industry. Whatever factors define an ‘outlier’ in a given business landscape, whether market share, gross revenue, stock price, longevity, or some combination, to be the absolute best is to be the outlier. Today’s big data platforms help businesses create powerful models for tracking and measuring trends, behaviors, and markets, but the results will only be as good as the analytical model. To become the outlier, you must first understand your own outliers.

By Rick Delgado
All things Big Data, Tech commentator, Enterprise Trends and every once in a while I write for @dell.
