SmartData Collective

Leveraging Metadata for (Really) Big Data

Don DeLoach

The word “metadata” has different meanings for different people. Most people think of it as the embodiment of Big Brother grabbing information about everything we do and say. More fundamentally, metadata is data that describes other data. In essence, it allows for quicker insight into, or easier interpretation of, the data than one would get from analyzing all of it at the atomic level. Some assume that key elements of the underlying data (like your name, who you called, or where you live) are pulled out of the overall data to create specific metadata. In that context, it is more akin to indexing an underlying database for quicker analysis of the information you know you are going to want.

Looking back, when reporting and analytics against traditional relational databases became problematic due to the sheer volume of data in those databases, we saw the rise of column-based analytic databases, which became the de facto approach. Moreover, many of these architectures were designed as general-purpose data warehouses, where the ability to scale horizontally to query larger data volumes serves the needs of the enterprise data warehouse very well.

But as machine-to-machine sensors, monitors, meters, and the like continue to fuel the Internet of Things, the enormous volumes of data are testing the capabilities of traditional database technologies and straining infrastructures that did not contemplate the dramatic increase in the amount of data coming in, the way the data would need to be queried, or the changing ways users would want to analyze it. This is why a different path is needed: an approach that is unique in that metadata is built on ingestion instead of creating indices or projections.

This has its pros and cons. The limitations of the underlying structure of the data become more important because of how the mathematics associated with the metadata actually works. In a nutshell, the more the data looks and feels like machine data, the better, so this approach is not ideally suited to a general-purpose data warehouse. On the upside, it offers very fast load speeds, very tight compression, and exceptional maneuverability over the data to support high-performance ad hoc queries and investigative analytics, and it does not require a database administrator to manage indices or tune the database. This is all because of the metadata.
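
To make the idea concrete, here is a minimal sketch of one way metadata built at ingestion can stand in for an index. It assumes a simple scheme of per-block summaries (minimum, maximum, and row count), which is an illustrative assumption and not a description of any particular vendor's design. The names `BlockMeta`, `ingest`, and `count_in_range` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class BlockMeta:
    """Summary of one block of rows, computed once at load time."""
    start: int    # index of the first row in the block
    end: int      # index one past the last row in the block
    lo: float     # smallest value in the block
    hi: float     # largest value in the block
    count: int    # number of rows in the block


def ingest(values: Sequence[float], block_size: int = 4) -> List[BlockMeta]:
    """Split the values into fixed-size blocks and summarize each block."""
    metas = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        metas.append(BlockMeta(start, start + len(block),
                               min(block), max(block), len(block)))
    return metas


def count_in_range(values: Sequence[float], metas: List[BlockMeta],
                   lo: float, hi: float) -> Tuple[int, int]:
    """Count rows with lo <= value <= hi.

    Returns (matching_rows, blocks_scanned). A block is only scanned
    when its metadata cannot settle the question on its own.
    """
    matches = 0
    scanned = 0
    for m in metas:
        if m.hi < lo or m.lo > hi:
            continue                      # block cannot contain a match
        if lo <= m.lo and m.hi <= hi:
            matches += m.count            # whole block matches, no scan needed
            continue
        scanned += 1                      # partial overlap: read the atomic data
        matches += sum(1 for v in values[m.start:m.end] if lo <= v <= hi)
    return matches, scanned


if __name__ == "__main__":
    # Hypothetical sensor readings that arrive roughly in order.
    readings = [1, 2, 3, 4, 10, 11, 12, 13, 20, 21, 22, 23]
    metas = ingest(readings, block_size=4)
    print(count_in_range(readings, metas, 11, 21))
```

The sketch also hints at why the shape of the data matters: readings that arrive roughly ordered produce blocks with narrow, non-overlapping ranges, so most blocks are either skipped or answered from their summary. Widely scattered values would give every block a broad range and force more scans of the atomic data.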

The really big (and really costly) database machines, starting with Teradata and extending to IBM/Netezza, Oracle Exadata, and SAP HANA, still have practical limitations in terms of dataset volumes, as do columnar stores like HP/Vertica, SAP/Sybase IQ, Redshift, and others. These were the enabling technologies when the wall associated with tens and sometimes hundreds of terabytes became an impediment. But as we all know, data scale salvation was to be found in Hadoop, Cassandra, and other NoSQL variants. Or was it? There is no doubt that many use cases that would previously have been limited can now realistically work using these technologies.

But more often, we are seeing instances where the “long running query” is a problem, even in the new world of Hadoop and now Spark. This is especially true when multiple tables must be joined for complex queries. SQL-on-Hadoop solutions like Impala, Hive, and Drill provide some relief, but it is hardly nirvana. If you need insight into the correlations that exist among multiple multi-petabyte tables, you might be waiting a while. And if that were not enough, virtually all projections suggest the volumes of data we must accommodate are now growing exponentially, primarily due to the rise of the Internet of Things.

There is an old saying that “Necessity is the mother of invention.” In this case, necessity looks to be a function of the massive amount of data where insight into that information is a strategic advantage, if not a basic requirement. The amount of time and certainly the cost associated with gaining that insight is becoming increasingly impractical. 

This brings us back to metadata. We will start to hear vendors talking about metadata more frequently. This makes logical sense, given where we are headed. As the market progresses, metadata is likely to be increasingly associated with addressing the gaps created by the time and money required to deal with atomic data. We can also expect a variety of approaches to using metadata, given the number of technology suppliers who care so deeply about this space. That’s a good thing. We will all be better off as the market as a whole evolves to meet our ever-changing needs.
