SmartData Collective

Statistical Rules of Thumb, Part III: Always Visualize the Data

DeanAbbott

As I perused Statistical Rules of Thumb again, as I do from time to time, I came across this gem: always graph the data. (Note: I live in CA, so I get no money from these Amazon links.)


Van Belle uses the term “Graph” rather than “Visualize”, but it is the same idea. The point is to visualize the data in addition to computing summary statistics. Summaries are useful but can be deceiving: any time you summarize data, you lose some information unless the distributions are well behaved. A scatterplot, histogram, or box-and-whisker plot can reveal the ways summaries fool you. I’ve seen this myself, especially with variables that have outliers or that are bi- or tri-modal.
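
As a minimal sketch of the bimodal case, the Python snippet below compares two small samples whose mean, median, and standard deviation are identical. The samples and the helper names (`summarize`, `text_histogram`) are made up for illustration and are not from the book; only the standard library is assumed.

```python
import statistics
from collections import Counter

# Hypothetical toy samples, constructed so that the summaries match exactly.
unimodal = [3, 4, 5, 5, 5, 5, 5, 5, 6, 7]
bimodal = [4, 4, 4, 4, 4, 6, 6, 6, 6, 6]


def summarize(values):
    """Return (mean, median, population standard deviation)."""
    return (
        statistics.mean(values),
        statistics.median(values),
        statistics.pstdev(values),
    )


def text_histogram(values):
    """Return one line per integer in the observed range, with a bar of '#' marks."""
    counts = Counter(values)
    return [
        f"{v:>2} | {'#' * counts[v]}"
        for v in range(min(values), max(values) + 1)
    ]


for name, sample in (("unimodal", unimodal), ("bimodal", bimodal)):
    print(name, summarize(sample))
    print("\n".join(text_histogram(sample)))
```

Both samples report the same three summary numbers, yet the text histogram shows that the second sample has no observations at all at its own mean. A real analysis would use a plotting tool rather than text bars; the point is only that the picture carries information the summaries drop.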

One of the most famous examples of this effect is Anscombe’s Quartet. I’m including the Wikipedia image of the plots here:

[Figure: Anscombe’s Quartet, four scatterplots, image from Wikipedia]

All four datasets have nearly identical summary statistics: the same mean of x, mean of y, standard deviation of x, standard deviation of y, Pearson correlation coefficient between x and y, and regression line of y on x. The summaries alone therefore tell you nothing about how different the datasets are.
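
The matching summaries can be checked directly. The sketch below uses the commonly published values of Anscombe's four datasets and computes the statistics listed above with plain Python; the function name `summarize_xy` and the dictionary layout are illustrative choices, not part of any library.

```python
import statistics

# Anscombe's quartet, using the commonly published values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (
        [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
        [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89],
    ),
}


def summarize_xy(x, y):
    """Means, sample standard deviations, Pearson r, and least-squares line of y on x."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return {
        "mean_x": mx,
        "mean_y": my,
        "sd_x": statistics.stdev(x),
        "sd_y": statistics.stdev(y),
        "r": sxy / (sxx * syy) ** 0.5,
        "slope": slope,
        "intercept": my - slope * mx,
    }


for name, (x, y) in QUARTET.items():
    stats = summarize_xy(x, y)
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

Rounded to two decimals, the four rows come out essentially the same, even though the scatterplots show a noisy line, a curve, a line with one outlier, and a vertical stack with one high-leverage point.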

I use correlations a lot to get the gist of the relationships in the data, and I’ve seen how correlations can deceive. In one project, we had 30K data points with a correlation of 0.9+. When we removed just 100 of these data points (those with the largest magnitudes of x and y), the correlation shrank to 0.23.
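
The same kind of effect is easy to reproduce on synthetic data. The sketch below is not the project data: it assumes 1,000 weakly related points plus 10 extreme points, all generated here for illustration, and the helper names `pearson` and `trim_largest` are hypothetical.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def trim_largest(points, k):
    """Drop the k points whose largest coordinate magnitude is greatest."""
    ranked = sorted(points, key=lambda p: max(abs(p[0]), abs(p[1])))
    return ranked[: len(ranked) - k]


rng = random.Random(0)  # fixed seed so the run is repeatable

# Assumed bulk: y depends only weakly on x.
bulk = []
for _ in range(1000):
    x = rng.gauss(0, 1)
    bulk.append((x, 0.2 * x + rng.gauss(0, 1)))

# Assumed outliers: a handful of very large points lying on a line.
outliers = [(50.0 + i, 50.0 + i) for i in range(10)]

points = bulk + outliers
trimmed = trim_largest(points, k=len(outliers))

r_all = pearson(*zip(*points))
r_trimmed = pearson(*zip(*trimmed))
print(f"with outliers:    r = {r_all:.2f}")
print(f"without outliers: r = {r_trimmed:.2f}")
```

With the extreme points included, the correlation is dominated by them and comes out high; once they are dropped, it falls back to the weak relationship in the bulk. A scatterplot of `points` would show the problem at a glance.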

Most data mining software now makes it easy to visualize data. Avail yourself of these tools to avoid subsequent surprises in your data.
