On Trees, Data Quality, and Big Data

November 25, 2012

I recently had some palm trees put into the backyard of my Nevada home. It was a downright cool experience that required industrial cranes to lift the two-ton trees above my home.

There was no damage done to my home or the driveway and nobody was injured. Everything went well. Well, almost everything. It turns out that the truck carrying these trees was delayed.

Did the drivers have a hard time loading these monstrosities? No. Was the enormous truck able to snake its way into my community? Yes. So, what was the problem?

Good old human error. When I bought the trees, my friend Jeff accompanied me. Jeff knows a thing or two about landscaping and I’m anything but a palm tree expert. I paid for the trees and gave the woman at the counter my proper address. I assumed that all was good to go.

Fast forward to tree delivery day. After a few hurried calls and general wonderment about where these things were, we identified the culprit. The saleswoman wrote down Jeff’s address on the deliver-to line, not mine. She put my address in the ‘notes’ section. For their part, the delivery guys didn’t read the notes and wound up driving 60 miles out of the way.

Lessons

I’ve seen many parallels between palm trees and enterprise data in my career. I’ve had users question the accuracy of my reports. I might hear things like “There’s no way that we had that many promotions last month! Your report is wrong!”

While I’m not perfect, I would often tell the skeptical user that we should check the data in the source system. More often than not, my report was accurate but the data pulled into that report was not. Thanks to audit tables and metadata, I could typically pinpoint the time, date, and creator of the errant record.

I would then work backwards. That is, after we knew that a user made this mistake, I would ask the natural next questions:

  • What other mistakes did this user make?
  • What else do we have to clean up?
  • Is there a larger departmental or organizational training issue?
  • Couldn’t we write a business rule or audit report to prevent the recurrence of this problem?
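
To make the last question concrete, here is a minimal sketch in Python of what such an audit report and business rule might look like. Everything in it is an assumption for illustration: the `AuditedRecord` fields, the `records_by_user` and `address_hidden_in_notes` helpers, and the rule itself (flag an order when the customer's address shows up in the notes but not on the deliver-to line). A real system would read from its own audit tables and would need smarter address matching than this.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class AuditedRecord:
    """One row from a hypothetical audit table (field names are made up)."""
    record_id: int
    created_by: str
    created_at: datetime
    deliver_to: str
    notes: str


def _normalize(text):
    """Lowercase and collapse whitespace so trivial differences don't matter."""
    return " ".join(text.lower().split())


def records_by_user(records, user):
    """Audit report: every record the given user created, oldest first."""
    return sorted(
        (r for r in records if r.created_by == user),
        key=lambda r: r.created_at,
    )


def address_hidden_in_notes(record, customer_address):
    """Business rule: flag a record whose deliver-to line differs from the
    customer's address while that address appears in the notes field."""
    expected = _normalize(customer_address)
    wrong_line = _normalize(record.deliver_to) != expected
    in_notes = expected in _normalize(record.notes)
    return wrong_line and in_notes
```

Once one bad record is traced to a user, `records_by_user` answers "what other records did this user create?", and running `address_hidden_in_notes` over those records gives a short cleanup list. The same check could run at the point of entry, before the truck ever leaves.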

Simon Says

Everyone makes mistakes, and I’m certainly no exception. The larger point here is that data matters, especially the accurate kind. One of my favorite expressions is PICNIC: problem in chair, not in computer. We can do simply amazing things with Big Data, but I’ll always insist that Small Data and data quality are just as important.
