SmartData Collective
© 2008-25 SmartData Collective. All Rights Reserved.

Data Integration Is the Schema in Between

MIKE20

The third of the five biggest data myths debunked by Gartner is that big data technology will eliminate the need for data integration. The truth is that big data technology excels at data acquisition, not data integration.

This myth is rooted in what Gartner referred to as the schema on read approach used by big data technology to quickly acquire a variety of data from sources with multiple data formats.

This is best exemplified by the Hadoop Distributed File System (HDFS). Unlike the predefined, and therefore predictably structured, data formats required by relational databases, HDFS is schema-less. It just stores data files, and those data files can be in just about any format. Gartner explained that “many people believe this flexibility will enable end users to determine how to interpret any data asset on demand. It will also, they believe, provide data access tailored to individual users.”
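To make the schema on read idea concrete, here is a minimal Python sketch. It does not use HDFS at all; an in-memory dictionary stands in for a schema-less file store, and the file names, fields, and the `read_with_schema` helper are all hypothetical. The point it illustrates is only that the store accepts anything, and structure is imposed by whoever reads the data.

```python
import csv
import io
import json

# Hypothetical raw files, kept exactly as they arrived.
# The "store" enforces no schema; it only maps names to content.
RAW_STORE = {
    "orders.csv": "order_id,amount\n1,19.99\n2,5.00\n",
    "orders.json": '[{"id": 3, "total": "7.50"}]',
}


def read_with_schema(name):
    """Interpret a stored file at read time, based on its extension."""
    text = RAW_STORE[name]
    if name.endswith(".csv"):
        return list(csv.DictReader(io.StringIO(text)))
    if name.endswith(".json"):
        return json.loads(text)
    raise ValueError(f"no reader available for {name}")


csv_rows = read_with_schema("orders.csv")
json_rows = read_with_schema("orders.json")

# Both files were acquired easily, but they do not agree on field names.
print(sorted(csv_rows[0]))   # ['amount', 'order_id']
print(sorted(json_rows[0]))  # ['id', 'total']
```

Acquisition was trivial, yet the two sources still disagree on field names and value types, so nothing here is integrated.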

While it was a great innovation to make data acquisition schema-less, more work has to be done to develop information because, as Gartner explained, “most information users rely significantly on schema on write scenarios in which data is described, content is prescribed, and there is agreement about the integrity of data and how it relates to the scenarios.”
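By contrast, schema on write can be sketched with Python's built-in `sqlite3` module, assuming a hypothetical `orders` table. Here the structure is agreed on before any data is stored, and a record that violates it is rejected at write time.

```python
import sqlite3

# Schema on write: the structure is declared before any data is stored.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        amount   REAL NOT NULL CHECK (amount >= 0)
    )
    """
)

# A record that matches the declared schema is accepted.
conn.execute("INSERT INTO orders VALUES (?, ?)", (1, 19.99))

# A record that breaks the schema (missing amount) is rejected on write.
try:
    conn.execute("INSERT INTO orders VALUES (?, ?)", (2, None))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(rejected, row_count)
```

This is the kind of agreement about the integrity of data that information users rely on, and it is exactly what a schema-less store does not provide.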

It has always been true that whenever you acquire data in various formats, it has to be transformed into a common format before it can be further processed and put to use. After schema on read and before schema on write is the schema in between.
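A minimal sketch of the schema in between, again in Python and under stated assumptions: the two sources, their field names, and the common `order_id`/`amount` format are all hypothetical. Each source gets its own small transformation into the common format, and only then can the records be processed together.

```python
import csv
import io
import json
from decimal import Decimal

# Two hypothetical sources describing the same kind of record differently.
CSV_SOURCE = "order_id,amount\n1,19.99\n2,5.00\n"
JSON_SOURCE = '[{"id": 3, "total": "7.50"}]'


def from_csv(text):
    """Map CSV rows to the common format: int order_id, Decimal amount."""
    for row in csv.DictReader(io.StringIO(text)):
        yield {"order_id": int(row["order_id"]), "amount": Decimal(row["amount"])}


def from_json(text):
    """Map JSON records (id/total) to the same common format."""
    for rec in json.loads(text):
        yield {"order_id": int(rec["id"]), "amount": Decimal(str(rec["total"]))}


def integrate():
    """Combine both sources into one list in the common format."""
    combined = [*from_csv(CSV_SOURCE), *from_json(JSON_SOURCE)]
    return sorted(combined, key=lambda r: r["order_id"])


orders = integrate()
total = sum(r["amount"] for r in orders)
print(total)  # 32.49
```

The mapping functions are the hard work: someone has to decide that `id` and `order_id` mean the same thing and that `total` and `amount` share a unit, and no storage technology makes that decision for you.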

Data integration is the schema in between. It always has been. Big data technology has not changed this because, as I have previously blogged, data stored in HDFS is not automatically integrated. And it’s not just Hadoop. Data integration is not a natural by-product of any big data technology, which is one of the reasons why technology is only one aspect of a big data solution.

Just as it has always been, in between data acquisition and data usage there’s a lot that has to happen. Not just data integration, but data quality and data governance too. Big data technology doesn’t magically make any of these things happen. In fact, big data just makes us even more painfully aware there’s no magic behind data management’s curtain, just a lot of hard work.
