The Top of the Data Quality Bell Curve

January 22, 2014

“Information is the value associated with data,” William McKnight explains in his book Information Management: Strategies for Gaining a Competitive Advantage with Data.  “Information is data under management that can be utilized by the company to achieve goals.”  Does that data have to be perfect in order to realize its value and enable the company to achieve its goals?  McKnight says no.

Data quality, according to McKnight, “is the absence of intolerable defects.”

“It is not the absence of defects.  Every enterprise will have those.  It is the absence of defects that see us falling short of a standard in a way that would have real, measurable negative business impact.  Those negative effects could see us mistreating customers, stocking shelves erroneously, creating foolish marketing campaigns, or missing chances for expansion.  Proper data quality management is also a value proposition that will ultimately fall short of perfection, yet will provide more value than it costs.”

“The proper investment in data quality is based on a bell curve on which the enterprise seeks to achieve the optimal ROI at the top of the curve.”

Mark Twain once said, “few things are harder to put up with than the annoyance of a good example.”

McKnight’s book provides many good examples, one based on an e-commerce/direct mail catalog/brick-and-mortar enterprise that regularly interacts with its customers.

“For e-commerce sales, address information is updated with every order.  Brick-and-mortar sales may or may not capture the latest address, and direct mail catalog orders will capture the latest address.  However, if I place an order and move two weeks later, my data is out-of-date: short of perfection.”

This is why I don’t like the anti-data-cleansing mantra of getting data right the first time, every time: even when you get data right the first time, that is not the last time the data has to be managed.

“Perfection is achievable,” McKnight continued, “but not economically achievable.  For instance, an enterprise could hire agents in the field to knock on their customers’ doors and monitor the license plates of cars coming and going to ensure that they know to the day when a customer moves.  This would come closer to perfect data on the current address of consumers, but at tremendous cost (not to mention that it would irritate the customer).”

Data perfection is an asymptote of data quality that is not economically achievable, and it is also not the goal of information management.  The goal of information management is to help the enterprise achieve its goals by providing data-driven solutions for business problems, which, by their very nature, are dynamic challenges that rarely have (or require) a perfect solution.
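
To make the bell curve idea concrete, here is a minimal sketch in Python.  It is a toy model and not anything taken from McKnight’s book: the functions benefit, cost, net_value, and best_quality, along with every constant in them, are hypothetical.  It assumes that business value rises steadily with data quality while the cost of achieving that quality grows without bound as quality approaches perfection, and it then looks for the quality level where value minus cost peaks.

```python
# Toy model only: the functional forms and constants below are
# illustrative assumptions, not figures from McKnight's book.


def benefit(q: float, max_benefit: float = 100.0) -> float:
    """Business value of data at quality level q (0 = unusable, 1 = perfect)."""
    return max_benefit * q


def cost(q: float, unit_cost: float = 5.0) -> float:
    """Cost of reaching quality level q; grows without bound as q nears 1."""
    return unit_cost * q / (1.0 - q)


def net_value(q: float) -> float:
    """Value returned by the data quality investment after paying for it."""
    return benefit(q) - cost(q)


def best_quality(steps: int = 100) -> float:
    """Grid-search quality levels in [0, 1) for the highest net value."""
    # Perfection (q = 1.0) is deliberately excluded: its cost is unbounded.
    candidates = [i / steps for i in range(steps)]
    return max(candidates, key=net_value)


if __name__ == "__main__":
    q_star = best_quality()
    print(f"Top of the curve: quality {q_star:.2f}, net value {net_value(q_star):.2f}")
    print(f"Near perfection:  quality 0.99, net value {net_value(0.99):.2f}")
```

Under these assumed numbers the peak sits well short of perfect quality, and pushing on toward 99 percent turns the net value negative.  The specific figures mean nothing; the shape is the point: more investment pays off up to the top of the curve and destroys value past it.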