© 2008-25 SmartData Collective. All Rights Reserved.

The Data Quality Goldilocks Zone

JimHarris

In astronomy, the habitable region of space where stellar conditions are favorable for life as it is found on Earth is referred to as the “Goldilocks Zone” because such a region of space is neither too close to the sun (making it too hot) nor too far away from the sun (making it too cold), but is “just right.” 

In data quality, there is also a Goldilocks Zone, which is the habitable region of time when project conditions are favorable for success. 

Too many projects fail because of lofty expectations, unmanaged scope creep, and the unrealistic perspective that data quality problems can be permanently “fixed” as opposed to needing eternal vigilance. To be successful, a project must be understood as an iterative process. Return on investment (ROI) is achieved by targeting well-defined objectives that deliver small, incremental returns, which build momentum toward larger success over time.

Data quality projects are easy to get started, even easier to end in failure, and often lack the decency of at least failing quickly. As with any complex problem, there is no fast and easy solution for data quality.

Projects are launched to understand and remediate the poor data quality that is negatively impacting decision-critical enterprise information. Data-driven problems require data-driven solutions. At the point in the project lifecycle when the team must decide whether the efforts of the current iteration are ready for implementation, it is dealing with the Data Quality Goldilocks Zone, which is measured not by proximity to the sun but by proximity to full data remediation, otherwise known as perfection.

The obvious problem is that perfection is impossible. An obsessive-compulsive quest to find and fix every data quality problem is a laudable pursuit but ultimately a self-defeating cause. Data quality problems can be very insidious, and even the best data remediation process will still produce exceptions. As a best practice, your process should be designed to identify and report exceptions when they occur. In fact, many implementations include logic to suspend exceptions for manual review and correction.
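As a rough illustration of that best practice, here is a minimal Python sketch of a remediation pass that reports exceptions and suspends them for manual review instead of forcing a fix. Every name in it (`RemediationResult`, `remediate`, `exception_report`, and the toy "name must not be empty" rule) is hypothetical and chosen only for this example; a real implementation would use its own rules and storage.

```python
from dataclasses import dataclass, field


@dataclass
class RemediationResult:
    """Outcome of one remediation pass (hypothetical structure)."""
    clean: list = field(default_factory=list)
    suspended: list = field(default_factory=list)  # held for manual review


def remediate(records, fix, is_valid):
    """Apply `fix` to each record; suspend any record that is still invalid."""
    result = RemediationResult()
    for record in records:
        candidate = fix(record)
        if is_valid(candidate):
            result.clean.append(candidate)
        else:
            # Keep the original record for a human to review rather than guessing.
            result.suspended.append(record)
    return result


def exception_report(result):
    """Summarize how many records were suspended in this pass."""
    total = len(result.clean) + len(result.suspended)
    rate = len(result.suspended) / total if total else 0.0
    return f"{len(result.suspended)} of {total} records suspended ({rate:.2%})"


# Illustrative rule only: trim whitespace, then require a non-empty name.
sample = [{"name": " Ada "}, {"name": ""}, {"name": "Grace"}]
outcome = remediate(
    sample,
    fix=lambda r: {**r, "name": r["name"].strip()},
    is_valid=lambda r: bool(r["name"]),
)
print(exception_report(outcome))
```

The point of the design is that the process never claims perfection: what it cannot fix, it sets aside and counts.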

Although all of this is easy to accept in theory, it is notoriously difficult to accept in practice. 

For example, imagine that your project is processing one billion records and that exhaustive analysis has determined that the results are correct 99.99999% of the time, meaning that exceptions occur in only 0.00001% of the total data population. Now imagine explaining these statistics to the project team but providing only the 100 exception records for review. Do not underestimate the difficulty that the human mind has with large numbers: 100 is an easy number to relate to, but one billion is practically incomprehensible. Also, don’t ignore the effect known as “negativity bias,” where bad evokes a stronger reaction than good in the human mind. Compare an insult and a compliment: which one do you remember more often? Focusing on the exceptions can undermine confidence and prevent acceptance of an overwhelmingly successful implementation.
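The arithmetic behind those figures can be checked in a few lines of Python. This is only a sketch of the calculation, using exact fractions to avoid floating-point rounding; the variable names are my own.

```python
from fractions import Fraction

total_records = 1_000_000_000
# 0.00001% expressed as a fraction: 0.00001 / 100 = 1 / 10,000,000
exception_rate = Fraction(1, 10_000_000)

exceptions = total_records * exception_rate
correct = total_records - exceptions
accuracy_pct = float(correct / total_records * 100)

print(f"Exceptions: {int(exceptions):,}")       # Exceptions: 100
print(f"Correct:    {int(correct):,}")          # Correct:    999,999,900
print(f"Accuracy:   {accuracy_pct:.5f}%")       # Accuracy:   99.99999%
```

Presenting the 999,999,900 correct records alongside the 100 exceptions gives the team both halves of the picture.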

If you can accept there will be exceptions, admit perfection is impossible, implement data quality improvements in iterations, and acknowledge when the current iteration has reached the Data Quality Goldilocks Zone, then your data quality initiative will not be perfect, but it will be “just right.”
