Big Data Governance

December 7, 2011

In which Jill wonders yet again how much size really matters.

It’s always interesting to hear somebody dismiss a trend.

“That’s not new!” he might say as he strokes his beard, lights a pipe, and mixes himself a Manhattan. “I worked on that stuff at my first job out of college, for cripes’ sake!” Then he flips the Peter, Paul, and Mary record, remembering the good old days of bra-burning and punch cards.

And so it is with the newest trend, Big Data. High-tech companies looking for more efficient ways to process and store their web transactions are often credited with lighting the big data fire. Big data represents the collision of the data warehouse, search, visualization, and storage worlds, and it brands the conundrum we’ve been facing (and largely ignoring): information is hitting companies at a faster rate than ever, and incumbent technology solutions are often too cumbersome or expensive to solve the problem.

So IT slips off its Birkenstocks and jumps into the technology sandbox to play with new toys like Hadoop, NoSQL, and grid computing. But in our conversations about big data, we overlook something just as important as the enabling technologies: the business-driven policy-making and oversight of all that big data. Yep. We’re forgetting data governance.

When it comes to big data, most of my clients are still in research mode. As their advisor, I’m bound to ask them that trite-yet-requisite Management Consulting Level-Setting Question: “What’s the need, pain, or problem you’re trying to solve?”

Often clients explain that they need to treat transaction data differently than they treat, say, customer master data. Fewer business rules, more history, that kind of thing. That’s when we start the work of classifying different data domains according to varying business policies:

Figure 1: Establishing Data Categories

Data classifications can get quite detailed, and there can be many categories. But if you’ve designed your data governance program the right way, you should be able to apply Guiding Principles to each category.
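For readers who think better in code, here is a rough sketch of what such a classification might look like once it is written down. Everything in it is an assumption made for illustration: the `DataCategory` fields, the category keys, and the principle wording are hypothetical, and only the contrast between transaction data and customer master data comes from the conversation described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCategory:
    """One entry in a data taxonomy (all field names are hypothetical)."""
    name: str
    rule_intensity: str   # "low" or "high": how many business rules apply
    history_depth: str    # "less" or "more": how much history is retained
    guiding_principles: tuple = ()


# Illustrative entries only. Transaction data gets fewer business rules
# and more history than customer master data; the principles are made up.
TAXONOMY = {
    "transaction": DataCategory(
        name="transaction",
        rule_intensity="low",
        history_depth="more",
        guiding_principles=("Retain full history",),
    ),
    "customer_master": DataCategory(
        name="customer_master",
        rule_intensity="high",
        history_depth="less",
        guiding_principles=("One agreed definition per attribute",),
    ),
}


def principles_for(category: str) -> tuple:
    """Return the guiding principles for a category, or fail loudly
    if the data has not been classified yet."""
    try:
        return TAXONOMY[category].guiding_principles
    except KeyError:
        raise ValueError(f"Unclassified data category: {category!r}") from None
```

The lookup fails on purpose for anything unclassified: data that nobody has assigned to a category is data that nobody has agreed how to govern.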

This strategy can then be used to gain consensus around optimal data management tactics, business rules, provisioning processes and, yes, technology for each category. Maybe that technology includes grid arrays or Hadoop. Maybe you’ll realize you don’t need new technology for a given category. Either way, you’re circumscribing a taxonomy for your data. That’s when the realization hits that the size of the data doesn’t matter as much as how you use it.