HadoopDB discussion with Daniel Abadi

TonyBain
Last updated: 2009/07/25 at 5:28 PM


I spoke to Daniel Abadi a few days ago about his recent HadoopDB announcement. This has no doubt been a busy time for Daniel and his team at Yale, as HadoopDB has been attracting a lot of interest, which I am sure will continue to build.

Some notes from our discussion:

  • HadoopDB is primarily focused on high scalability and the availability required at scale. Daniel questions whether current MPP databases can truly scale past 100 nodes, whereas Hadoop has real deployments on 3000+ nodes.
  • Like many MPP analytical database platforms, HadoopDB uses shared-nothing relational databases as its processing units; in HadoopDB's case, Postgres. Unlike other MPP databases, HadoopDB uses Hadoop as the distribution mechanism.
  • I am ad-libbing here, but I understand that Daniel doesn’t dispute DeWitt and Stonebraker’s (and his own) paper, which claims that MapReduce underperforms current MPP DBMSs. HadoopDB, however, is focused on massive scale: hundreds or thousands of nodes. Currently, the largest MPP database we know of has 96 nodes.
  • Early benchmarking shows that HadoopDB outperforms Hadoop but is slower than current MPP databases under normal circumstances. However, when simulating node failure mid-query, HadoopDB outperformed current MPP databases significantly.
  • The larger the deployment, the higher the probability of a node failure mid-query. Very large Hadoop deployments may experience at least one node failure per query (job).
  • HadoopDB is usable today but should not be considered an “out of the box” solution. HadoopDB is the outcome of a database research initiative, not a commercial venture. Anyone planning to use HadoopDB will need the appropriate systems and development skills to deploy it effectively.
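
The point about node failure becoming more likely at scale can be illustrated with a little arithmetic. The sketch below is not from the discussion with Daniel: it assumes each node fails independently during a query with the same probability, and the value used for that probability (`p_node = 0.001`) is purely hypothetical. Under those assumptions, the chance that at least one of n nodes fails mid-query is 1 - (1 - p)^n.

```python
def prob_any_failure(nodes: int, p_node: float) -> float:
    """Probability that at least one of `nodes` nodes fails during a query.

    Assumes failures are independent and every node has the same
    per-query failure probability `p_node` (an illustrative simplification).
    """
    return 1.0 - (1.0 - p_node) ** nodes


if __name__ == "__main__":
    p_node = 0.001  # hypothetical per-node, per-query failure probability
    # Cluster sizes: a small cluster, the 96-node MPP example,
    # and Hadoop-scale deployments mentioned in the notes.
    for nodes in (10, 96, 1000, 3000):
        print(f"{nodes:>5} nodes: {prob_any_failure(nodes, p_node):.1%}")
```

Whatever per-node figure is plugged in, the result climbs towards certainty as the node count grows, which is why the ability to recover from a failure without restarting the whole query matters more at thousands of nodes than at tens.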

HadoopDB is an innovative approach to the scalability challenges that continue to push the architecture of the modern database forward.
