How Much Big Data is Too Much?

August 3, 2012

With storage costs plummeting and sophisticated software approaches to mining Big Data, it appears increasingly cost-effective for corporations and governments to keep all types of data, even those previously discarded. However, how much “Big Data” should corporations, entities and governments keep online or archived, especially when “Right to Be Forgotten” debates are swirling?

Like it or not, all kinds of data are captured every day. James Gleick, in “The Information,” sums it up nicely:

“The information produced and consumed by humankind used to vanish—that was the norm, the default. The sights, the sounds, the spoken word just melted away. Now the expectations have inverted. Everything may be recorded and preserved at least potentially; every musical performance, every crime, elevator, city street, every volcano or tsunami on the remotest shore…”

With petabytes of storage and virtual machines available in the cloud on a pay-per-use basis, and on-premises storage costs dropping like a rock, it’s conceivable for companies and governments to keep every image, video, recording, keystroke, and web-generated data type. Of course, all these data are of little use without techniques to mine them and discover information. Fortunately, BI and data warehousing technologies have worked wonders over the past thirty to forty years for data that needs to be organized, and we have MapReduce/Hadoop to assist in assembling and analyzing an organized data garbage dump.

There are two consequences of this data deluge. 

For individuals, there is the feeling of drowning in a sea of data that is difficult to manage, much less scrutinize. Novelist David Foster Wallace called this scenario “Total Noise”: the feeling of drowning in a deep pool of too many tweets, posts, phone calls, podcasts and more. And because this total noise causes “information anxiety” for some, plenty of people are deleting their social media accounts.

The second consequence concerns privacy and security. Since everything that can be captured is in the process of being captured, our likes, rants, passions and partialities are recorded online and archived offline in perpetuity. These concerns have prompted proposed privacy legislation such as the EU’s “Right to Be Forgotten,” under which digital providers will need to cull, upon request, digital references owned by individuals.

These consequences raise the question: how much Big Data is too much? What should be kept for corporate reasons (to serve customers better, sell more products, optimize business processes, etc.)? What should be kept for governmental concerns (tracking bank flows for money laundering, watching for potential terrorist activity, monitoring fringe groups that don’t see eye to eye with government officials)? And with pending legislation such as “Right to Be Forgotten” being considered in statehouses across the world, is it more hassle than it’s worth to keep all this Big Data, especially if there are financial penalties for not complying with legislation?

More and more people are looking into avenues to “erase themselves” from the web. However, the web is simply one information source, especially in our coming sensor-driven society. Companies and governments will continue to have the option of capturing all kinds of data. The key question they’ll need to answer is, “How much Big Data is too much?” And perhaps for some entities, the answer will be that there are few constraints.

 
