SmartData Collective
© 2008-25 SmartData Collective. All Rights Reserved.

Google and Apache Hadoop: A Match Made in the Cloud

Michele Nemschoff

To the uninitiated, words like “Google” and “Hadoop” sound like the stuff of a futuristic make-believe world. Because the MapReduce paper published by Google scientists Jeffrey Dean and Sanjay Ghemawat in 2004 inspired Hadoop, the coming together of Hadoop and Google is a match made in the cloud. And the partnership between MapR and Google to run MapR’s Enterprise Distribution for Hadoop on Google Compute Engine is anything but science fiction. Here’s a look at some of the major benefits of using Hadoop on Google Compute Engine.

Flexibility

Running Hadoop on Google Compute Engine draws on the power and efficiency of Google’s data centers to execute at scale and solve large problems. On Google Cloud Platform, enterprises can expand or contract the cluster on demand, provisioning precisely the resources their data processing needs require.

World-record speed and performance

With MapR’s Enterprise Distribution for Hadoop on Google Compute Engine, it’s possible to spin up well over a thousand servers in a matter of minutes and run scalable applications at high speed. In fact, MapR ran Hadoop on Google Compute Engine and set a world record for MinuteSort, sorting 15 billion 100-byte records (1.5 TB) in only 60 seconds. The run used 2,103 virtual instances, each consisting of four virtual cores and a virtual disk.

The Hadoop/Google virtualized cloud environment set the record using far fewer servers, disks and cores than Yahoo used in setting the prior record. To put it simply, Hadoop on Google Cloud Platform not only does more with less, it does so faster than the biggest and best on-premise Big Data platforms. This type of performance allows enterprises to tackle large-scale workloads quickly and easily, gaining greater business insight and competitive advantage and driving higher ROI.
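
To put the MinuteSort figures in perspective, here is a minimal Python sketch that derives the total data volume and the average throughput from the numbers quoted above. The function name `minutesort_summary` and its keys are made up for illustration, decimal units (1 TB = 10^12 bytes) are assumed, and the per-instance figure is a simple average rather than a measured rate.

```python
# Figures quoted in the article for the MinuteSort run.
RECORDS = 15_000_000_000   # records sorted
RECORD_BYTES = 100         # bytes per record
SECONDS = 60               # duration of the run
INSTANCES = 2_103          # virtual instances used
CORES_PER_INSTANCE = 4     # virtual cores per instance


def minutesort_summary(records=RECORDS, record_bytes=RECORD_BYTES,
                       seconds=SECONDS, instances=INSTANCES,
                       cores_per_instance=CORES_PER_INSTANCE):
    """Derive volume and average throughput (decimal units assumed)."""
    total_bytes = records * record_bytes
    bytes_per_second = total_bytes / seconds
    return {
        "total_tb": total_bytes / 1e12,
        "aggregate_gb_per_s": bytes_per_second / 1e9,
        "per_instance_mb_per_s": bytes_per_second / instances / 1e6,
        "total_cores": instances * cores_per_instance,
    }


if __name__ == "__main__":
    for name, value in minutesort_summary().items():
        print(f"{name}: {value:,.2f}")
```

Under these assumptions the run sorted 1.5 TB at an average of 25 GB per second across the cluster, or a little under 12 MB per second per instance, spread over 8,412 virtual cores.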

Cost-effectiveness

According to MapR CEO John Schroeder, who discussed Hadoop and Google Compute Engine at Google I/O, the physical hardware that an enterprise would need to approximate what Yahoo used to achieve its 62-second benchmark would conservatively cost $6 million to acquire and take several months to install. And those estimates, Schroeder explains, don’t even factor in the cost of all the electrical power needed to handle the server load, not to mention the 50-75 tons of air conditioning that would be required to cool the data center. In contrast, Schroeder says the cost of running Hadoop on Google Compute Engine for the 54 seconds it took to set the new 1TB TeraSort benchmark was a mere $16.
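
The gap between those two figures can be made concrete with a short Python sketch. It uses only the numbers Schroeder cites; the function name `cost_comparison` is hypothetical, and the hourly rate is a straight-line extrapolation from the 54-second run, an assumption rather than a published price. It also sets a one-time hardware purchase against a single benchmark run, so it ignores power, cooling, staffing and any other use of the hardware.

```python
# Figures attributed to John Schroeder in the article.
ON_PREM_HARDWARE_USD = 6_000_000  # estimated hardware acquisition cost
CLOUD_RUN_USD = 16                # cost of the record-setting cloud run
CLOUD_RUN_SECONDS = 54            # duration of that run


def cost_comparison(on_prem_usd=ON_PREM_HARDWARE_USD,
                    cloud_run_usd=CLOUD_RUN_USD,
                    cloud_run_seconds=CLOUD_RUN_SECONDS):
    """Compare the quoted hardware cost with the quoted cost of one run."""
    return {
        # How many times larger the hardware bill is than one cloud run.
        "hardware_to_run_ratio": on_prem_usd / cloud_run_usd,
        # Linear extrapolation of the run cost to a full hour (assumption).
        "implied_usd_per_hour": cloud_run_usd / cloud_run_seconds * 3600,
        # Whole benchmark-sized runs the hardware budget would pay for.
        "runs_for_hardware_budget": on_prem_usd // cloud_run_usd,
    }


if __name__ == "__main__":
    for name, value in cost_comparison().items():
        print(f"{name}: {value:,.2f}")
```

On these numbers the hardware estimate equals 375,000 benchmark-sized runs, and keeping a cluster of that size running would work out to roughly $1,067 per hour if the cost scaled linearly.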

Using Google as the cloud provider eliminates the need for enterprises to pay huge costs for on-premise servers that must be replaced with newer models every three years and may never be used to full capacity. Enterprises pay Google only for the resources they use to meet their data processing demands. And the costs associated with running Enterprise Hadoop on Google Compute Engine are extremely reasonable compared to traditional infrastructure.

In short, if you’re looking for a flexible, fast, and cost-effective Big Data platform, MapR’s Hadoop distribution running on Google Compute Engine just might be the right solution for your business.
