Data Cleanup: Michael Clayton and The Consultant’s Role


I recently attended the DataFlux IDEAS 2010 Conference in balmy Palm Springs, CA. I finally had the chance to meet quite a few thought leaders in the field.


As expected, one of the principal topics of the conference involved data issues. Put simply, some organizations are quite good at dealing with them–while others continue to struggle. I have certainly come across more than my fair share of data issues in my day. (What consultant hasn’t?)

Many of the conference attendees were on the client side. Their presentations tended to focus on three- to five-year plans. Their ability to make long-term improvements in their organizations’ data impressed me a great deal for one simple reason: I never get to focus on the long term. I am a fixer, much like the title character of the 2007 movie Michael Clayton. In this post, I discuss the role of people like me in dealing with enterprise data issues.

The Consultant’s Role

As a consultant, it is often my role to identify potential or probable data issues for a client. Whether I use one of the many specialized data cleanup tools or stalwarts such as Microsoft Access or Excel, I have found that it’s typically not terribly difficult to identify potentially questionable records. The key word here is potentially, as many records need to be manually examined before they can be fixed, consolidated, purged, or retired.
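To make the identification step concrete, here is a minimal sketch in Python rather than Access or Excel. Everything in it is an assumption for illustration: the field names (id, name, email), the two checks (missing required fields and names that match after normalization), and the sample records are all hypothetical. Note that it only flags records for a manual look; it does not decide anything.

```python
from collections import defaultdict


def normalize(name):
    """Lowercase and collapse whitespace so near-identical names compare equal."""
    return " ".join(name.lower().split())


def flag_suspects(records, required=("name", "email")):
    """Return {record id: [reasons]} for records that deserve a manual look.

    The field names and the two checks are illustrative assumptions,
    not rules taken from any particular cleanup tool.
    """
    reasons = defaultdict(list)
    by_name = defaultdict(list)

    for rec in records:
        # Check 1: required fields that are absent or blank.
        for field in required:
            if not (rec.get(field) or "").strip():
                reasons[rec["id"]].append(f"missing {field}")
        # Group records by normalized name for the duplicate check.
        if (rec.get("name") or "").strip():
            by_name[normalize(rec["name"])].append(rec["id"])

    # Check 2: more than one record sharing the same normalized name.
    for ids in by_name.values():
        if len(ids) > 1:
            for rec_id in ids:
                reasons[rec_id].append("possible duplicate name")

    return dict(reasons)


# Hypothetical sample data.
records = [
    {"id": 1, "name": "Acme Corp", "email": "ap@acme.example"},
    {"id": 2, "name": "acme  corp ", "email": ""},
    {"id": 3, "name": "Globex", "email": "info@globex.example"},
]
suspects = flag_suspects(records)
```

Records 1 and 2 come back as possible duplicates of each other, and record 2 is also flagged for a missing email. Whether they really are the same vendor is still a question for a person who knows the data.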

But let’s not get ahead of ourselves. Identification is simply the first step in the process, and often the easiest. Once suspect records have been isolated, they must be investigated and ultimately fixed or consolidated. Here’s where it’s usually a good idea to stop using phrases such as “not terribly difficult.”

Treading Carefully

Some people become defensive when presented with data errors. When at a new client and unaware of the political terrain, I try to say very innocently that “someone may have done something wrong.” I find that it’s much less confrontational than pointing a finger, even when I know who did what when. Often, end users are quick to plead ignorance or blame predecessors for mistakes. In the event that they themselves have made the mistakes (audit trails are pretty hard to dispute), the tone of the conversation is quite different. There’s usually a reason that an end-user did what s/he did.

It’s ultimately the client’s role to make the final call on what to do with suspect records. Far too often, however, end users do not have the time, desire, or skill set to make these calls. (I have written before on the different focuses of consultants and client end users.) Failure to address data issues in a timely manner typically causes many problems, from cascading delays on other project tasks to incomplete testing.

Sometimes on IT projects, expectations are totally out of whack. Vendors are sometimes responsible for these chasms because, during the sales cycle, they underestimate the amount of time required to clean up key enterprise information. Let’s not forget that project managers during the engagement tend not to have expertise in data management issues. They tend not to know how long cleanup can take. Nor do clients, for that matter.

It’s the job of the consultant to let everyone know. Just don’t kill the messenger.


What say you?

Read more at MIKE2.0: The Open Source Standard for Information Management