SmartData Collective

Leveraging Metadata for (Really) Big Data

Don DeLoach

The word “metadata” means different things to different people. Many think of it as the embodiment of Big Brother grabbing information about everything we do and say. More fundamentally, metadata is data that describes other data. In essence, it allows for quicker insight into, or easier interpretation of, the data than one might get from analyzing all of it at an atomic level. Some assume that key elements of the underlying data (like your name, who you called, or where you live) are pulled out of the overall data to create specific metadata. In that context, it is more akin to indexing an underlying database for quicker analysis of the information you know you are going to want.
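To make "data that describes other data" concrete, here is a minimal Python sketch. It is an illustration only: the `ColumnSummary` structure, the `summarize` helper, and the sample readings are all hypothetical and do not come from any particular product. The idea is that a small summary, computed once, can answer some questions without touching the atomic values again.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ColumnSummary:
    """Metadata describing a column of readings (hypothetical structure)."""
    count: int
    total: float
    minimum: float
    maximum: float


def summarize(values: List[float]) -> ColumnSummary:
    """Scan the atomic data once and record a few facts about it."""
    return ColumnSummary(len(values), sum(values), min(values), max(values))


# Atomic data: made-up sensor readings, used only for illustration.
readings = [21.5, 22.0, 19.5, 23.0]
meta = summarize(readings)

# Later questions are answered from the metadata alone, with no rescan.
average = meta.total / meta.count
value_range = (meta.minimum, meta.maximum)
```

The summary is far smaller than the data it describes, which is the property the rest of the article leans on.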

Looking back, when reporting and analytics against traditional relational databases became problematic because of the sheer volume of data those databases held, we saw the rise of column-based analytic databases, which became the de facto approach. Many of these architectures were designed to be general-purpose data warehouses, where the ability to scale horizontally to query larger data volumes serves the needs of the enterprise data warehouse very well.

But as machine-to-machine sensors, monitors, meters, and similar devices continue to fuel the Internet of Things, the enormous volumes of data are testing the capabilities of traditional database technologies and straining infrastructures that did not contemplate the dramatic increase in the amount of data coming in, the way the data would need to be queried, or the changing ways users would want to analyze it. This is why a different path is needed: an approach in which metadata is built on ingestion, instead of creating indices or projections.

This has its pros and cons. The limitations of the underlying structure of the data become more important because of how the mathematics associated with the metadata actually works. In a nutshell, the more the data looks and feels like machine data, the better, so this approach is not ideally suited to serve as a general-purpose data warehouse. On the upside, it offers very fast load speeds, very tight compression, and exceptional maneuverability over the data to support high-performance ad hoc queries and investigative analytics, and it does not require a database administrator to manage indices or tune the database. This is all because of the metadata.
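One way to picture how metadata gathered at load time can stand in for hand-built indices is a block-level minimum/maximum summary, sometimes called a zone map. The Python sketch below is an assumption made for illustration, not a description of any vendor's implementation; `Block`, `ingest`, `count_in_range`, and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Block:
    """A chunk of rows plus metadata recorded while loading (hypothetical)."""
    rows: List[int]
    lo: int  # smallest value in the block
    hi: int  # largest value in the block


def ingest(values: List[int], block_size: int = 4) -> List[Block]:
    """Split incoming values into blocks, recording min/max as they load."""
    blocks = []
    for i in range(0, len(values), block_size):
        chunk = values[i:i + block_size]
        blocks.append(Block(chunk, min(chunk), max(chunk)))
    return blocks


def count_in_range(blocks: List[Block], low: int, high: int) -> Tuple[int, int]:
    """Count values in [low, high]; also report how many blocks were scanned."""
    matches = 0
    scanned = 0
    for b in blocks:
        if b.hi < low or b.lo > high:
            continue  # metadata proves the block is irrelevant: skip it
        if low <= b.lo and b.hi <= high:
            matches += len(b.rows)  # metadata proves every row matches
            continue
        scanned += 1  # only ambiguous blocks require reading atomic data
        matches += sum(1 for v in b.rows if low <= v <= high)
    return matches, scanned


# Made-up, roughly ordered readings, as machine-generated data often is.
sensor = [10, 11, 12, 13, 50, 51, 52, 53, 90, 91, 92, 93]
blocks = ingest(sensor)
matches, scanned = count_in_range(blocks, 12, 60)
```

In this toy run only one of the three blocks has to be read row by row; the other two are resolved from their min/max values. The sketch also hints at why machine-like data fits well: values that arrive in narrow, orderly ranges give each block a tight min/max, so more blocks can be skipped or answered outright.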

The really big (and really costly) database machines, starting with Teradata and extending to IBM/Netezza, Oracle Exadata, and SAP HANA, still have practical limits on dataset volume, as do columnar stores like HP/Vertica, SAP/Sybase IQ, Amazon Redshift, and others. These were the enabling technologies when the wall at tens and sometimes hundreds of terabytes became an impediment. But as we all know, data scale salvation was to be found in Hadoop, Cassandra, and other NoSQL variants. Or was it? There is no doubt that many use cases now work with these technologies where they would previously have been limited.

But more often, we are seeing instances where the “long running query” is a problem, even in the new world of Hadoop and now Spark. This is especially true when multiple tables must be joined for complex queries. SQL-on-Hadoop solutions like Impala, Hive, and Drill provide some relief, but they are hardly nirvana. If you need insight into the correlations that exist among multiple multi-petabyte tables, you might be waiting a while. And if that were not enough, virtually all projections suggest the volumes of data we must accommodate are now growing exponentially, primarily due to the rise of the Internet of Things.

There is an old saying that “Necessity is the mother of invention.” In this case, the necessity stems from the massive amount of data, where insight into that information is a strategic advantage, if not a basic requirement. The time, and certainly the cost, associated with gaining that insight are becoming increasingly impractical.

This brings us back to metadata. We will start to hear vendors talking about metadata more frequently, which makes sense given where we are headed. As the market progresses, metadata is likely to be used increasingly to address the gaps created by the time and money required to deal with atomic data. We can also expect a variety of approaches to using metadata, given the number of technology suppliers who care so deeply about this space. That’s a good thing. We will all be better off as the market as a whole evolves to meet our ever-changing needs.
