The Data Curves

Supply of, and demand for, data have never been higher. Executives often say, “I have all of the data that I can handle. I just need more information.”

Consider just the data generated from mobile devices. From a recent ReadWriteWeb piece:

Worldwide mobile data traffic is due to increase 26-fold to 75 exabytes annually, says networking giant Cisco in its latest report, the Cisco Visual Networking Index Global Mobile Data Traffic Forecast for 2010 to 2015. To put that in perspective, that’s the equivalent of 19 billion DVDs, 536 quadrillion SMS text messages or 75 times the amount of global Internet IP data (fixed and mobile data) in the year 2000.

In a word, wow. At least there is some good news. Storing data has never been cheaper, a trend that shows no signs of abating. Consider that Amazon just announced its new AWS pricing. Outbound data transfer prices for the US-Standard, US-West and Europe regions are as follows:

$0.000 – first 1 GB / month data transfer out
$0.150 per GB – up to 10 TB / month data transfer out
$0.110 per GB – next 40 TB / month data transfer out
$0.090 per GB – next 100 TB / month data transfer out
$0.080 per GB – data transfer out / month over 150 TB

So, if I transfer out 9 terabytes of data in a month, outbound transfer costs will amount to roughly $1,350 (USD): about 9,000 GB at $0.15 per GB, with the first gigabyte free.
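For readers who want to check the arithmetic against other volumes, here is a minimal sketch of the tiered calculation. It assumes the tiers in the price list are cumulative (1 GB, then up to 10 TB, 50 TB and 150 TB in total) and that a terabyte is counted as 1,000 GB. The function name monthly_transfer_cost and the gb_per_tb parameter are my own illustrations, not anything Amazon publishes.

```python
def monthly_transfer_cost(total_gb, gb_per_tb=1000):
    """Estimate the monthly outbound transfer cost in USD for total_gb gigabytes.

    Assumption: tiers are cumulative and 1 TB = gb_per_tb GB (1,000 by default;
    pass 1024 if your provider counts terabytes in binary units).
    """
    # (cumulative upper bound in GB, price per GB in USD), from the list above
    tiers = [
        (1, 0.000),                  # first 1 GB is free
        (10 * gb_per_tb, 0.150),     # up to 10 TB
        (50 * gb_per_tb, 0.110),     # next 40 TB
        (150 * gb_per_tb, 0.090),    # next 100 TB
        (float("inf"), 0.080),       # everything over 150 TB
    ]

    cost = 0.0
    lower = 0  # lower bound of the current tier, in GB
    for upper, price_per_gb in tiers:
        if total_gb <= lower:
            break  # nothing left to bill in this or any higher tier
        billable_gb = min(total_gb, upper) - lower
        cost += billable_gb * price_per_gb
        lower = upper
    return cost


if __name__ == "__main__":
    # 9 TB in a month: 8,999 billable GB at $0.15 per GB
    print(f"${monthly_transfer_cost(9 * 1000):,.2f}")  # prints $1,349.85
```

Counting a terabyte as 1,024 GB instead (gb_per_tb=1024) nudges the 9 TB figure up to $1,382.25. Either way, the bill lands in the low four figures.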

Opposing Forces

In other words, the amount of data out there is growing exponentially while the cost of keeping it is dropping in a similar manner. This is crudely represented in my simple chart below:

[Chart: amount of data rising while its cost falls]

So, every day, we are generating more data, and storing it keeps getting cheaper. But do we need to store all of that data? The answer is usually no. This, of course, raises the obvious question: Which data do we need to keep?

And there’s the rub. I am of the opinion that it’s better to have it and not need it than need it and not have it. What’s more, keeping the data in one central repository is generally the way to go. Among the benefits of a “master data” strategy is the minimization of those pesky duplicate records that vex end-users trying to make business decisions based upon extra, inaccurate, or incomplete information.

Reporting is also vastly simplified. To be sure, sophisticated reporting tools allow developers and extremely technical folks to pull data from myriad sources, tables and data models. However, performance tends to suffer, and many non-technical users aren’t exactly proficient at writing complex SQL statements. This increases the need for IT to be involved when, in an ideal world, they should not have to “bless” each reporting request.

Simon Says

It’s never easy to determine which data is essential, which is nice-to-have, and which is no longer relevant. To boot, ask ten different users in an organization and you’re likely to receive ten different answers. Still, it behooves most organizations to routinely ask if they are storing the right information in the right manner. After all, needs are hardly static, especially these days.

Data storage is anything but a “set it and forget it” type of thing. Ignore changing requirements at your own peril.