
Need for a Robust Data Quality Framework for Big Data

koolhits
3 Min Read

The challenges associated with data quality, and the corresponding accountability across business domains and research areas, have long been a concern. Among the key data quality problems are:

  • Non-interoperability – Data collected in one system are not electronically transmittable to other systems. Re-entering the same data in multiple systems consumes resources and increases the potential for data-entry errors.
  • Non-standardized data definitions – Different data providers use different definitions for the same elements. When passed on to the district or state level, non-comparable data are aggregated inappropriately and produce inaccurate results.
  • Unavailability of data – The required data do not exist or are not readily accessible because of one quality issue or another. In some cases, data providers may take a “just fill something in” approach to satisfy distant data collectors, thus creating errors.
  • Inconsistent item response – Not all data providers report the same data elements. Idiosyncratic reporting of different types of information from different sources creates gaps and errors in macro-level data aggregation.
  • Inconsistency over time – The same data element is calculated, defined, and/or reported differently from year to year. Longitudinal inconsistency creates the potential for inaccurate analysis of trends over time.
  • Data entry errors – Inaccurate data are entered into a data collection instrument. Errors in reporting can occur at any point in the process, from the student’s assessment answer sheet to the state’s report to the federal government.
  • Lack of timeliness – Data are reported too late. Late reporting can jeopardize the completeness of macro-level reporting.

We need a practical, readily implementable approach in which data quality rules can be defined just like any other business rules, with proactive reporting of quality issues, checkpoints on newly inserted data, and so on.
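
As a rough illustration of that idea (not part of the original article), the sketch below registers quality rules as plain functions and runs them as a checkpoint over each new record before insertion. The `QualityCheckpoint` class, the rule names, and fields such as `student_id` and `score` are assumptions made for this example.

```python
# Hypothetical sketch: data quality rules declared like business rules,
# with a checkpoint that validates new records before they are inserted.
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class QualityCheckpoint:
    """Registry of named quality rules applied to every incoming record."""
    rules: dict[str, Callable[[dict[str, Any]], bool]] = field(default_factory=dict)

    def add_rule(self, name: str, check: Callable[[dict[str, Any]], bool]) -> None:
        self.rules[name] = check

    def validate(self, record: dict[str, Any]) -> list[str]:
        """Return the names of all rules the record violates."""
        return [name for name, check in self.rules.items() if not check(record)]


# Example rules, defined just like any other business rule.
checkpoint = QualityCheckpoint()
checkpoint.add_rule("score_in_range", lambda r: 0 <= r.get("score", -1) <= 100)
checkpoint.add_rule("student_id_present", lambda r: bool(r.get("student_id")))

violations = checkpoint.validate({"student_id": "", "score": 150})
print(violations)  # ['score_in_range', 'student_id_present']
```

In practice such a checkpoint would sit in the ingestion pipeline, and its violation reports would feed the kind of proactive quality reporting described above.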


Imagine if we had a framework that could enforce some of the following validation rules (a sketch of a few of these checks follows the list):

  1. Range Check – This checks that the data lies within a specified range of values
  2. Presence Check – This checks that required data are not missing
  3. Domain Check – This checks that only values from an accepted set are allowed
  4. Cross-Field Check – This checks that multiple fields in combination are valid
  5. Cross-Table Check – This checks that multiple tables in combination are valid
  6. Uniqueness Validation – This ensures that the values in a column are unique
  7. Referential Integrity Validation – This validates values between tables in a relational database model
  8. Duplicate Identification – This identifies a row as an unwanted duplicate record
  9. Format Validation – This constrains data values to a preset mask pattern
  10. Business Rule Compliance – This checks that data conform to defined business rules
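
To make a few of these rules concrete, here is a minimal, self-contained sketch (not from the original article) covering the range, presence, domain, uniqueness/duplicate, and format checks. The records, column names, accepted value ranges, and mask pattern are all hypothetical.

```python
# Illustrative sketch of a few of the checks listed above, applied to a small
# table of records. All column names and thresholds are hypothetical.
import re

records = [
    {"id": "S001", "grade": 7, "score": 88, "email": "a@example.org"},
    {"id": "S002", "grade": 7, "score": 105, "email": ""},           # range + presence issues
    {"id": "S002", "grade": 13, "score": 71, "email": "b@example"},  # duplicate id, domain + format issues
]

issues = []
seen_ids = set()
for row in records:
    # 1. Range Check: score must lie within 0-100.
    if not 0 <= row["score"] <= 100:
        issues.append((row["id"], "range", "score"))
    # 2. Presence Check: email must not be missing.
    if not row["email"]:
        issues.append((row["id"], "presence", "email"))
    # 3. Domain Check: grade must come from an accepted set of values.
    if row["grade"] not in range(1, 13):
        issues.append((row["id"], "domain", "grade"))
    # 6/8. Uniqueness Validation / Duplicate Identification: id must not repeat.
    if row["id"] in seen_ids:
        issues.append((row["id"], "duplicate", "id"))
    seen_ids.add(row["id"])
    # 9. Format Validation: email must match a simple mask pattern.
    if row["email"] and not re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", row["email"]):
        issues.append((row["id"], "format", "email"))

for record_id, rule, column in issues:
    print(f"{record_id}: failed {rule} check on '{column}'")
```

Cross-field, cross-table, and referential integrity checks follow the same pattern, but compare values across columns or tables rather than within a single row.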

