Cookies help us display personalized product recommendations and ensure you have great shopping experience.

By using this site, you agree to the Privacy Policy and Terms of Use.
Accept
SmartData CollectiveSmartData Collective
  • Analytics
    AnalyticsShow More
    data analytics
    How Data Analytics Can Help You Construct A Financial Weather Map
    4 Min Read
    financial analytics
    Financial Analytics Shows The Hidden Cost Of Not Switching Systems
    4 Min Read
    warehouse accidents
    Data Analytics and the Future of Warehouse Safety
    10 Min Read
    stock investing and data analytics
    How Data Analytics Supports Smarter Stock Trading Strategies
    4 Min Read
    predictive analytics risk management
    How Predictive Analytics Is Redefining Risk Management Across Industries
    7 Min Read
  • Big Data
  • BI
  • Exclusive
  • IT
  • Marketing
  • Software
Search
© 2008-25 SmartData Collective. All Rights Reserved.
Reading: Virtualization, Federation or Just Plain Access
Share
Notification
Font ResizerAa
SmartData CollectiveSmartData Collective
Font ResizerAa
Search
  • About
  • Help
  • Privacy
Follow US
© 2008-23 SmartData Collective. All Rights Reserved.
SmartData Collective > Business Intelligence > Virtualization, Federation or Just Plain Access
Business Intelligence

Virtualization, Federation or Just Plain Access

Barry Devlin
Barry Devlin
7 Min Read
SHARE

Virtualization-Federation.pngThere are still many illusions and unjustified expectations about big data.  But, one old belief–dating back to the early days of data warehousing–that it has shattered is in a single store that can serve all BI needs.  Given the volumes an

Virtualization-Federation.pngThere are still many illusions and unjustified expectations about big data.  But, one old belief–dating back to the early days of data warehousing–that it has shattered is in a single store that can serve all BI needs.  Given the volumes and variety of big data, any thought of routing it all through a relational database environment just doesn’t make sense.  And after the market’s brief flirtation with the idea that all data could be handled in Hadoop (doh!), there is a general belief that IT needs to provide some sort of over-arching, integrating view for users across multiple data stores.

Cirro is among the latest players in this field, as I discovered talking to CEO Mark Theissen, previously data warehousing technical lead at Microsoft and a veteran of DATAllegro and Brio.  Its basic value proposition is to offer users self-driven exploration–via Cirro’s Excel plug-in and BI tools–of data across a wide variety of platforms via ad hoc federation.  Cirro’s starting point is big data scale and performance, offering a data hub with a cost-based federation optimizer, smart caching and a function library of low level MapReduce and SQL functions.  It also offers an optional “multi store” consisting of Hadoop and MySQL components that can be used as a temporary scratchpad area or a data mart.

In our conversation, Theissen declared that Cirro does federation, whereas competitors like Composite and Denodo do virtualization.  The difference, in his view, is that virtualization involves an expensive and time-consuming phase to create a semantic layer, while federation is done on the fly and, in the case of Cirro, using existing metadata from BI tools, databases and so on.  I wish it were that simple to differentiate between these two phrases, which have become a marketing battleground for many of the vendors competing in this field from the majors like IBM and Informatica to the newcomers such as Karmasphere and ClearStory.

More Read

Online Business Model Innovation
Predictive Analytics, an Historical Perspective: Interview with…
Myth or Fact : The Diminishing Marginal Returns of Variable Creation in Data Mining Solutions.
Beyond the Bottomline: ERP Tools Are Meant for Ensuring Customer Satisfaction
VectorWise

I’d like to try to clarify the two terms… again.

The concept of federation (in data) goes back to the mid-1980s with the concept of federating SQL queries against the then-emerging relational databases.  By 1991, IBM’s Information Warehouse Framework included access to heterogeneous databases via EDA/SQL from Information Builders.  By the early years of the new millennium, the need to join data from multiple, heterogeneous sources beyond traditional databases was widespread, often described as enterprise information integration (EII).  But, vendor offerings were poorly received, especially in BI, because of concerns about mismatched data meanings, security and query performance.  I consider federation as the basic technology of being able to split up a query in real time into component parts, distribute it to heterogeneous, autonomous sources and retrieve and combine the results.  To do this, access to technical metadata that defines database (or file) locations and structures, data volumes, network performance and more is needed to enable query optimization for access and performance.

Data virtualization, in my view, builds on top of federation with knowledge of the business-related metadata required to address the problem of disparate data meanings, relationships and currencies and deliver high quality results that are meaningful and consistent for the business user submitting the query.  Simply put, there are two ways to address these problems and supply the needed metadata.  The easiest approach is to depend on the business user to understand data consistency and similar quasi-IT issues and to make sensible (in terms of data coherence and reliable results) queries.  The second way is to model the data to some extent upfront and create a semantic layer, as it’s often called, that ensures the quality of returned results.

The former approach typically leads to faster, cheaper implementations; the latter to longer-term quality at some upfront cost.  The former works better if you’re coming from a big data view point, where much of the data is poorly defined, changing and of questionable accuracy and consistency in any case.  The latter favors enterprise information management where quality and consistency are key.  The reality of today’s world, however, is that we need both!
Cirro, with its sights set on big data and its minimal formal structure, strongly favors the first approach.  Allowing, indeed encouraging, users to build their explorations in the freeform environment that is Excel is a strong statement in itself.  It’s typically fast, easy and iterative, all highly valued qualities in today’s break-neck speed business environment.  However, when you link from there to the (hopefully) high-quality data warehouse, the need for a more formal and modeled approach becomes clear.  

So, which approach to choose?  It depends on your starting point and initial drivers.  And your long-term needs.  Composite, for example, focuses more on the prior creation of business views to shield users from the technical complexity and inconsistencies in typical enterprise data.  Denodo, in contrast, talks of both bottom-up and top-down modeling to address both sets of needs.  In the long run, you’ll probably need both approaches: the speed of an ad hoc approach for sandboxing and the quality of semantic modeling for production integration.

TAGGED:data federationdata virtualizationDenodo
Share This Article
Facebook Pinterest LinkedIn
Share

Follow us on Facebook

Latest News

data analytics
How Data Analytics Can Help You Construct A Financial Weather Map
Analytics Exclusive Infographic
AI use in payment methods
AI Shows How Payment Delays Disrupt Your Business
Artificial Intelligence Exclusive Infographic
financial analytics
Financial Analytics Shows The Hidden Cost Of Not Switching Systems
Analytics Exclusive Infographic
multi model ai
How Teams Using Multi-Model AI Reduced Risk Without Slowing Innovation
Artificial Intelligence Exclusive

Stay Connected

1.2KFollowersLike
33.7KFollowersFollow
222FollowersPin

You Might also Like

Data Virtualization
Big DataData ManagementData VisualizationExclusiveITPrivacySecurity

Understanding the Different Forms of Data Virtualization

7 Min Read
Image
Business IntelligenceExclusive

Big Data converging Data and Content

5 Min Read

Consolidation in the Social Software Market Continues: VMware Acquires SocialCast

5 Min Read
Image
Big Data

The Case for Using Data Virtualization for Big Data Analytics

6 Min Read

SmartData Collective is one of the largest & trusted community covering technical content about Big Data, BI, Cloud, Analytics, Artificial Intelligence, IoT & more.

data-driven web design
5 Great Tips for Using Data Analytics for Website UX
Big Data
ai in ecommerce
Artificial Intelligence for eCommerce: A Closer Look
Artificial Intelligence

Quick Link

  • About
  • Contact
  • Privacy
Follow US
© 2008-25 SmartData Collective. All Rights Reserved.
Go to mobile version
Welcome Back!

Sign in to your account

Username or Email Address
Password

Lost your password?