Cookies help us display personalized product recommendations and ensure you have great shopping experience.

By using this site, you agree to the Privacy Policy and Terms of Use.
Accept
SmartData CollectiveSmartData Collective
  • Analytics
    AnalyticsShow More
    image fx (67)
    Improving LinkedIn Ad Strategies with Data Analytics
    9 Min Read
    big data and remote work
    Data Helps Speech-Language Pathologists Deliver Better Results
    6 Min Read
    data driven insights
    How Data-Driven Insights Are Addressing Gaps in Patient Communication and Equity
    8 Min Read
    pexels pavel danilyuk 8112119
    Data Analytics Is Revolutionizing Medical Credentialing
    8 Min Read
    data and seo
    Maximize SEO Success with Powerful Data Analytics Insights
    8 Min Read
  • Big Data
  • BI
  • Exclusive
  • IT
  • Marketing
  • Software
Search
© 2008-25 SmartData Collective. All Rights Reserved.
Reading: Virtualization, Federation or Just Plain Access
Share
Notification
Font ResizerAa
SmartData CollectiveSmartData Collective
Font ResizerAa
Search
  • About
  • Help
  • Privacy
Follow US
© 2008-23 SmartData Collective. All Rights Reserved.
SmartData Collective > Business Intelligence > Virtualization, Federation or Just Plain Access
Business Intelligence

Virtualization, Federation or Just Plain Access

Barry Devlin
Barry Devlin
7 Min Read
SHARE

Virtualization-Federation.pngThere are still many illusions and unjustified expectations about big data.  But, one old belief–dating back to the early days of data warehousing–that it has shattered is in a single store that can serve all BI needs.  Given the volumes an

Virtualization-Federation.pngThere are still many illusions and unjustified expectations about big data.  But, one old belief–dating back to the early days of data warehousing–that it has shattered is in a single store that can serve all BI needs.  Given the volumes and variety of big data, any thought of routing it all through a relational database environment just doesn’t make sense.  And after the market’s brief flirtation with the idea that all data could be handled in Hadoop (doh!), there is a general belief that IT needs to provide some sort of over-arching, integrating view for users across multiple data stores.

Cirro is among the latest players in this field, as I discovered talking to CEO Mark Theissen, previously data warehousing technical lead at Microsoft and a veteran of DATAllegro and Brio.  Its basic value proposition is to offer users self-driven exploration–via Cirro’s Excel plug-in and BI tools–of data across a wide variety of platforms via ad hoc federation.  Cirro’s starting point is big data scale and performance, offering a data hub with a cost-based federation optimizer, smart caching and a function library of low level MapReduce and SQL functions.  It also offers an optional “multi store” consisting of Hadoop and MySQL components that can be used as a temporary scratchpad area or a data mart.

In our conversation, Theissen declared that Cirro does federation, whereas competitors like Composite and Denodo do virtualization.  The difference, in his view, is that virtualization involves an expensive and time-consuming phase to create a semantic layer, while federation is done on the fly and, in the case of Cirro, using existing metadata from BI tools, databases and so on.  I wish it were that simple to differentiate between these two phrases, which have become a marketing battleground for many of the vendors competing in this field from the majors like IBM and Informatica to the newcomers such as Karmasphere and ClearStory.

More Read

Webinar on the ROI of business rules in decision management
Can AI Help Create an Ideal Employee Compensation Package?
Great Examples of US Government BI Transparency
Comparing Costs of Different Cloud Computing Providers
AI-Driven Discord Bots Can Track Server Stats

I’d like to try to clarify the two terms… again.

The concept of federation (in data) goes back to the mid-1980s with the concept of federating SQL queries against the then-emerging relational databases.  By 1991, IBM’s Information Warehouse Framework included access to heterogeneous databases via EDA/SQL from Information Builders.  By the early years of the new millennium, the need to join data from multiple, heterogeneous sources beyond traditional databases was widespread, often described as enterprise information integration (EII).  But, vendor offerings were poorly received, especially in BI, because of concerns about mismatched data meanings, security and query performance.  I consider federation as the basic technology of being able to split up a query in real time into component parts, distribute it to heterogeneous, autonomous sources and retrieve and combine the results.  To do this, access to technical metadata that defines database (or file) locations and structures, data volumes, network performance and more is needed to enable query optimization for access and performance.

Data virtualization, in my view, builds on top of federation with knowledge of the business-related metadata required to address the problem of disparate data meanings, relationships and currencies and deliver high quality results that are meaningful and consistent for the business user submitting the query.  Simply put, there are two ways to address these problems and supply the needed metadata.  The easiest approach is to depend on the business user to understand data consistency and similar quasi-IT issues and to make sensible (in terms of data coherence and reliable results) queries.  The second way is to model the data to some extent upfront and create a semantic layer, as it’s often called, that ensures the quality of returned results.

The former approach typically leads to faster, cheaper implementations; the latter to longer-term quality at some upfront cost.  The former works better if you’re coming from a big data view point, where much of the data is poorly defined, changing and of questionable accuracy and consistency in any case.  The latter favors enterprise information management where quality and consistency are key.  The reality of today’s world, however, is that we need both!
Cirro, with its sights set on big data and its minimal formal structure, strongly favors the first approach.  Allowing, indeed encouraging, users to build their explorations in the freeform environment that is Excel is a strong statement in itself.  It’s typically fast, easy and iterative, all highly valued qualities in today’s break-neck speed business environment.  However, when you link from there to the (hopefully) high-quality data warehouse, the need for a more formal and modeled approach becomes clear.  

So, which approach to choose?  It depends on your starting point and initial drivers.  And your long-term needs.  Composite, for example, focuses more on the prior creation of business views to shield users from the technical complexity and inconsistencies in typical enterprise data.  Denodo, in contrast, talks of both bottom-up and top-down modeling to address both sets of needs.  In the long run, you’ll probably need both approaches: the speed of an ad hoc approach for sandboxing and the quality of semantic modeling for production integration.

TAGGED:data federationdata virtualizationDenodo
Share This Article
Facebook Pinterest LinkedIn
Share

Follow us on Facebook

Latest News

image fx (2)
Monitoring Data Without Turning into Big Brother
Big Data Exclusive
image fx (71)
The Power of AI for Personalization in Email
Artificial Intelligence Exclusive Marketing
image fx (67)
Improving LinkedIn Ad Strategies with Data Analytics
Analytics Big Data Exclusive Software
big data and remote work
Data Helps Speech-Language Pathologists Deliver Better Results
Analytics Big Data Exclusive

Stay Connected

1.2kFollowersLike
33.7kFollowersFollow
222FollowersPin

You Might also Like

data science jobs
Jobs

Is Virtual Networking The Key to Landing a Data Science Job?

5 Min Read
Data Virtualization
Big DataData ManagementData VisualizationExclusiveITPrivacySecurity

Understanding the Different Forms of Data Virtualization

7 Min Read

Consolidation in the Social Software Market Continues: VMware Acquires SocialCast

5 Min Read
Image
Business IntelligenceExclusive

Big Data converging Data and Content

5 Min Read

SmartData Collective is one of the largest & trusted community covering technical content about Big Data, BI, Cloud, Analytics, Artificial Intelligence, IoT & more.

giveaway chatbots
How To Get An Award Winning Giveaway Bot
Big Data Chatbots Exclusive
ai is improving the safety of cars
From Bolts to Bots: How AI Is Fortifying the Automotive Industry
Artificial Intelligence

Quick Link

  • About
  • Contact
  • Privacy
Follow US
© 2008-25 SmartData Collective. All Rights Reserved.
Go to mobile version
Welcome Back!

Sign in to your account

Username or Email Address
Password

Lost your password?