Building a More Powerful Data Quality Scorecard

January 3, 2009
53 Views

Most data governance practitioners agree that a data quality scorecard is an important tool in any data governance program. It provides comprehensive information about quality of data in a database, and perhaps even more importantly, allows business users and technical users to collaborate on the quality issue.

However, if we show that 7% of all tables have data quality issues, the number is useless – there is no context. You can’t say whether it


Most data governance practitioners agree that a data quality scorecard is an important tool in any data governance program. It provides comprehensive information about quality of data in a database, and perhaps even more importantly, allows business users and technical users to collaborate on the quality issue.

However, if we show that 7% of all tables have data quality issues, the number is useless – there is no context. You can’t say whether it is good or bad, and you can’t make any decisions based on this information. There is no value associated with the score.

In an effort to improve processes, the data governance teams should roll-up the data into metrics into slightly higher formulations. In their book “Journey to Data Quality”, authors Lee, Pipino, Funk and Wang correctly suggest that making the measurements quantifiable and traceable provide the next level of transparency to the business. The metrics may be rolled up into a completeness rating, for example if your database contains 100,000 name and address postal codes and 3,500 records are incomplete, 3.5% of your postal codes failed and 96.5% pass. Similar simple formulas exist for Accuracy, Correctness, Currency and Relevance, too. However, this first aggregation still doesn’t support data governance, because business users aren’t thinking that way. They have processes that are supported by data and it’s still a stretch figuring out why this all matters.

Views of Data Quality Scorecard
Your plan must be to make data quality scorecards for different internal audiences – marketing, IT, c-level, etc.

The aggregation might look something like this:You must design the scorecards to meet the needs of the interest of the different audiences, from technical through to business and up to executive. At the beginning of a data quality scorecard is information about data quality of individual data records. This is the default information that most profilers will deliver out of the box. As you aggregate scores, the high-level measures of the data quality become more meaningful. In the middle are various score sets allowing your company to analyze and summarize data quality from different perspectives. If you define the objective of a data quality assessment project as calculating these different aggregations, you will have much easier time maturing your data governance program. The business users and c-level will begin to pay attention.

Business users are looking for whether the data supports the business process. They want to know if the data is facilitating compliance with laws. They want to decide whether their programs are “Go”, “Caution” or “Stop” like a traffic light. They want to know whether the current processes are giving them good data so they can change them if necessary. You can only do this by aggregating the information quality results and aligning those results with business.

Link to original post

You may be interested

Big Data Revolution in Agriculture Industry: Opportunities and Challenges
Analytics
69 shares1,955 views
Analytics
69 shares1,955 views

Big Data Revolution in Agriculture Industry: Opportunities and Challenges

Kayla Matthews - July 24, 2017

Big data is all about efficiency. There are many types of data available, and many ways to use that information.…

How SAP Hana is Driving Big Data Startups
Big Data
298 shares3,201 views
Big Data
298 shares3,201 views

How SAP Hana is Driving Big Data Startups

Ryan Kh - July 20, 2017

The first version of SAP Hana was released in 2010, before Hadoop and other big data extraction tools were introduced.…

Data Erasing Software vs Physical Destruction: Sustainable Way of Data Deletion
Data Management
156 views
Data Management
156 views

Data Erasing Software vs Physical Destruction: Sustainable Way of Data Deletion

Manish Bhickta - July 20, 2017

Physical Data destruction techniques are efficient enough to destroy data, but they can never be considered eco-friendly. On the other…