Digital Universe Study: The Big Hype

Barry Devlin
Last updated: 2011/08/13 at 9:33 AM

In my last post, I discussed some of the key points in the 5th annual Digital Universe study from IDC, released by EMC in June.  Here, I consider a few more: some of the implications of the changes in sourcing on security and privacy, the importance of considering transient data, where volumes are a number of orders of magnitude higher, and a gentle reminder that bigger is not necessarily the nub of the problem.

Let’s start with transient data. IDC notes that “a gigabyte of stored content can generate a petabyte or more of transient data that we typically don’t store (e.g., digital TV signals we watch but don’t record, voice calls that are made digital in the network backbone for the duration of a call)”. Now, as an old data warehousing geek, that type of statement typically rings alarm bells: what if we miss some business value in the data that we never stored? How can we ever recheck at a future date the results of an old analysis we made in real-time? We used to regularly encounter this problem with DW implementations that focused on aggregated data, often because of the cost of storing the detailed data. Over the years, decreasing storage costs meant that more warehouses moved to storing the detailed data. But now, it seems like we are facing the problem again. However, going from a gigabyte to a petabyte is a factor of a million! And, as the study points out, the “growth of the [permanent] digital universe continues to outpace the growth of storage capacity”. So, this is probably a bridge too far for hardware evolution.
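For readers who like to see the arithmetic, here is a minimal sketch of that factor-of-a-million gap. It assumes decimal (SI) units, and the constant and function names are mine, purely for illustration; nothing here comes from the IDC study itself.

```python
import math

# Assumption: decimal (SI) storage units, expressed in bytes.
GIGABYTE = 10**9
PETABYTE = 10**15


def transient_to_stored_ratio(stored_bytes: int, transient_bytes: int) -> float:
    """Return how many times larger the transient volume is than the stored volume."""
    return transient_bytes / stored_bytes


# One gigabyte of stored content against one petabyte of transient data.
ratio = transient_to_stored_ratio(GIGABYTE, PETABYTE)
orders_of_magnitude = round(math.log10(ratio))

print(f"ratio: {ratio:,.0f}x ({orders_of_magnitude} orders of magnitude)")
```

With binary units (2^30 and 2^50 bytes) the ratio comes out slightly above a million, which does not change the argument.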

The implication (for me) is that our old paradigm about the need to keep raw, detailed data needs to be reconsidered, at least for certain types of data. This leads to the point about “big data” and whether the issue is really about size at all. The focus on size, which is the sound-bite for this study and most of the talk about big data, distracts us from the reality that this expanding universe of data contains some very different types of data to traditional business data and comes from a very different class of sources. Simplistically, we can see two very different types of big data: (1) human-generated content, such as voice and video, and (2) machine metric data, such as website server logs and RFID sensor event data. Both types are clearly big in volume, but in terms of structure, information value per gigabyte, retention needs and more, they are very different beasts. It is also interesting to note that some vendors are beginning to specialize. Infobright, for example, is focusing on what they call “machine-generated data”, a class of big data that is particularly suited to their technical strengths.

Finally, a quick comment on security and privacy. The study identifies the issues: “Less than a third of the information in the digital universe can be said to have at least minimal security or protection; only about half the information that should be protected is protected.” Given how much information consumers are willing to post on social networking sites or share with businesses in order to get a 1% discount, this is a significant issue that proponents of big data and data warehousing projects need to address. As we bring this data from social networking sources into our internal information-based decision-making systems, we will increasingly expose our business to possible charges of misusing information, exposing personal information, and so on.

There are many more thought-provoking observations in the Digital Universe study. It is well worth a read for anybody considering integrating data warehousing and big data.
