
Seven Misconceptions about Data Quality

RickSherman
Last updated: 2011/08/25 at 9:44 AM

The narrow definition of data quality is that it’s about bad data: data that is missing or incorrect. A broader definition is that data quality is achieved when a business uses data that is comprehensive, consistent, relevant and timely. If you focus only on the narrow definition, you may be lulled into a false sense of security when, in fact, your efforts fall short. This column addresses several more misconceptions about data quality.

In order to fix a problem you have to recognize you have a problem. According to recent Gartner research, 25 percent of Fortune 1000 companies are working with poor quality data. The Data Warehousing Institute (TDWI) estimated that data quality problems cost U.S. businesses $600 billion each year. Regulatory initiatives such as Sarbanes-Oxley and Basel II dictate that companies must provide transparent data. But even with the documented high costs of poor data quality and the tight regulatory environment, many companies are turning a blind eye to their data quality problems. Why? Perhaps it is because of their mistaken belief that bad data is the only data quality issue they need to worry about.

A corollary to the above: to fix a problem you first have to take responsibility for it. That’s the rub. Taking responsibility is the biggest roadblock to dealing with data quality. In order to achieve a high level of quality, data has to be viewed from an enterprise and holistic perspective. Data may be correct within each data silo, but the information will not be consistent, relevant or timely when viewed across the enterprise. To make matters worse, you’ve got each report or analysis interpreting the data differently, so even when the numbers start off the same in each silo, the end results will not be consistent. Data is a corporate asset and has to be consistent across the entire corporation, not just within the business function or division where it originated.

Misconception #1: You Can Fix Data
Fixing implies that there was something wrong with the original data, and that you can fix it once and be done with it. In reality, the problem may have been not with the data itself, but rather with the way it was used. When you manage data you manage data quality. It’s an ongoing process. Data cleansing is not the answer to data quality issues. Yes, data cleansing does address some important data quality problems and offers a solid ROI, but it is only one element of the data quality puzzle. Too often the business purchases a data cleansing tool and thinks the problem is solved. In other cases, because the cost of data cleansing tools is high, a business may decide that it is too expensive to deal with the problem at all.

Misconception #2: Data Quality is an IT Problem
Data quality is a company problem that costs a business in many ways. Although IT can help address the problem of data quality, the business has to own the data and the business processes that create or use it. The business has to define the metrics for data quality – its completeness, consistency, relevancy and timeliness. The business has to determine the threshold between data quality and ROI. IT can enable the processes and manage data through technology, but the business has to define it. For an enterprise-wide data quality effort to be initiated and successful on an ongoing basis, it needs to be truly a joint business and IT effort.
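
As a rough illustration of that split, the sketch below shows IT computing two of the metrics named above while the thresholds stay a business decision. The record layout, the field names, the 90-day freshness window and the threshold values are all assumptions made up for this example, not figures from any real system.

```python
from datetime import date

# Hypothetical customer records; field names are illustrative only.
records = [
    {"customer_id": "C001", "email": "a@example.com", "updated": date(2011, 8, 1)},
    {"customer_id": "C002", "email": None, "updated": date(2011, 8, 20)},
    {"customer_id": "C003", "email": "c@example.com", "updated": date(2010, 1, 15)},
    {"customer_id": "C004", "email": "d@example.com", "updated": date(2011, 7, 30)},
]

def completeness(rows, field):
    """Share of rows in which the field is populated."""
    filled = sum(1 for row in rows if row.get(field) not in (None, ""))
    return filled / len(rows)

def timeliness(rows, as_of, max_age_days):
    """Share of rows updated within max_age_days of the as_of date."""
    fresh = sum(1 for row in rows if (as_of - row["updated"]).days <= max_age_days)
    return fresh / len(rows)

# The thresholds are the business's call; these values are placeholders.
THRESHOLDS = {"completeness": 0.95, "timeliness": 0.90}

scores = {
    "completeness": completeness(records, "email"),
    "timeliness": timeliness(records, as_of=date(2011, 8, 25), max_age_days=90),
}
failing = [name for name, score in scores.items() if score < THRESHOLDS[name]]

print(scores)   # {'completeness': 0.75, 'timeliness': 0.75}
print(failing)  # ['completeness', 'timeliness']
```

The code is the easy part. Deciding that 95 percent completeness is worth paying for, and which fields it applies to, is the part only the business can own.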

Misconception #3: The Problem is in the Data Sources or Data Entry
Data entry or operational systems are often blamed for data quality problems. Although incorrectly entered or missing data is a problem, it is far from the only data quality problem. Also, everyone blames their data quality problems on the systems that they sourced the data from. Although some of that may be true, a large part of the data quality issue is the consistency, relevancy and timeliness of the data. If two divisions are using different customer identifiers or product numbers, does it mean that one of them has the wrong numbers or is the problem one of consistency between the divisions? If the problem is consistency, then it is an enterprise issue, not a divisional issue. The long-term solution may be for all divisions to use the same codes, but that has to be an enterprise decision.
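
The identifier example can be made concrete with a small sketch. Everything in it is hypothetical: the two division extracts, the ID formats and the enterprise cross-reference are invented to show that both divisions can hold correct data and still fail to line up.

```python
# Hypothetical extracts from two divisions; each is internally correct.
division_a = {"CUST-1001": "Acme Corp", "CUST-1002": "Globex"}
division_b = {"A-77": "Acme Corp", "G-12": "Globex", "I-03": "Initech"}

# An enterprise-level cross-reference, assumed to be owned by the business.
# Each enterprise id maps to (division A id, division B id).
crosswalk = {
    "ENT-1": ("CUST-1001", "A-77"),
    "ENT-2": ("CUST-1002", "G-12"),
}

def unmatched_ids(division, position):
    """Ids in a division that no enterprise id accounts for."""
    mapped = {pair[position] for pair in crosswalk.values()}
    return sorted(set(division) - mapped)

# A naive join on raw ids finds nothing in common, yet neither side is "wrong".
shared_raw_ids = set(division_a) & set(division_b)

print(shared_raw_ids)                # set()
print(unmatched_ids(division_a, 0))  # []
print(unmatched_ids(division_b, 1))  # ['I-03']
```

Neither division entered bad data here. The gap only appears, and can only be closed, at the enterprise level, where someone has to decide who maintains the cross-reference or whether both divisions move to one set of codes.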

The larger issue is that you need to manage data from its creation all the way to information consumption. You need to be able to trace its flow from data entry, transactional systems, data warehouse, data marts and cubes all the way to the report or spreadsheet used for the business analysis. Data quality requires tracking, checking and monitoring data throughout the entire information ecosystem. To make this happen you need data responsibility (people), data metrics (processes) and meta data management (technology). (We’ll address how in a future column.)

Misconception #4: The Data Warehouse will Provide a Single Version of the Truth
In an ideal world, every report or analysis performed by the business exclusively uses data sourced from the data warehouse: data that has gone through data cleansing and quality processes and includes consistent interpretations such as profit or sales calculations. If everyone uses the data warehouse’s data exclusively and it meets your data quality metrics, then it is the single version of the truth.

However, two significant conditions lessen the likelihood that the data warehouse solves your data quality issues by itself. First, people get data for their reports and analysis from a variety of sources: the data warehouse (sometimes there are multiple data warehouses in an enterprise), and data marts and cubes (that you hope were sourced from the data warehouse). They also get data directly from systems such as ERP, CRM, and budgeting and planning systems that may themselves feed the data warehouse. In these cases, ensuring data quality in the data warehouse alone is not enough. Multiple data silos mean multiple versions of the truth and multiple interpretations of the truth. Data quality has to be addressed across these data silos, not just in the data warehouse.

Second, data quality involves the source data and its transformation into information. That means that even if every report and analysis gets data from the same data warehouse, if the business transformations and interpretations in these reports are different then there still are significant data quality issues. Data quality processes need to involve data creation; the staging of data in data warehouses, data marts, cubes and data shadow systems; and information consumption in the form of reports and business analysis. Applying data quality to the data itself and not its usage as information is not sufficient.

Misconception #5: The ERP System will Provide a Single Version of the Truth
Ditto what I said for Misconception #4.

Misconception #6: The Corporate Performance Management (CPM) System will Provide a Single Version of the Truth
Ditto what I said for Misconception #4.

Misconception #7: BI Standardization will Eliminate the Problem of Different “Truths” Represented in the Reports or Analysis
Yes, standardizing on BI tools can save money and may be a worthwhile project. But don’t lose sight of the fact that the use of different BI tools is a symptom of a data quality problem, not the cause. If you pull the same data and implement the same transformations (formulas) in different BI tools, you get the same results. The report, chart or dashboard may look a little different, but the numbers will be the same. The problem, therefore, is not that different BI tools are being used, but that each project implementing these tools built a different data mart or cube and then applied different formulas in its reports or analysis. Using the same BI tool in different projects that use different data with different transformations is still going to yield different results, and so the data quality issues remain. The cause of the data quality issues is the lack of consistency in the data used and the transformations applied, not the use of different BI tools.
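
A minimal sketch of that point, using made-up order lines and two invented definitions of “sales”: both functions run in the same language against the same rows, so the tool is identical, yet the totals disagree because the formulas do.

```python
# Hypothetical order lines shared by two reporting projects.
orders = [
    {"gross": 100.0, "discount": 10.0, "returned": False},
    {"gross": 200.0, "discount": 0.0, "returned": True},
    {"gross": 50.0, "discount": 5.0, "returned": False},
]

def sales_project_a(rows):
    """Project A's assumed definition: gross sales, returns included."""
    return sum(row["gross"] for row in rows)

def sales_project_b(rows):
    """Project B's assumed definition: net of discounts, returns excluded."""
    return sum(row["gross"] - row["discount"] for row in rows if not row["returned"])

total_a = sales_project_a(orders)
total_b = sales_project_b(orders)

print(total_a)  # 350.0
print(total_b)  # 135.0
```

Standardizing the tool would change neither number. Agreeing on one definition of sales, and having every report use it, would.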

Data quality is defined as comprehensive, consistent, relevant and timely data for use by the business. Don’t shrug it off as an issue of bad data entry. Data quality needs to be addressed on an enterprise scale and in a holistic manner, incorporating people, processes and technology.
