Google and Apache Hadoop: A Match Made in the Cloud

Michele Nemschoff

To the uninitiated, words like “Google” and “Hadoop” sound like the stuff of a futuristic make-believe world. Because the MapReduce paper published by Google scientists Jeffrey Dean and Sanjay Ghemawat in 2004 inspired Hadoop, the coming together of Hadoop and Google is a match made in the cloud. And the partnership between MapR and Google to run MapR’s Enterprise Distribution for Hadoop on Google Compute Engine is anything but science fiction. Here’s a look at some of the major benefits of using Hadoop on Google Compute Engine.

Flexibility

Running Hadoop on Google Compute Engine leverages the power and efficiency of Google’s data centers to execute at scale and solve large problems. Utilizing the Google Cloud Platform, enterprises have the flexibility to expand or contract the cluster size on demand to provision precisely the amount of resources required to meet their data processing needs.

World-record speed and performance

With MapR’s Enterprise Distribution for Hadoop on Google Compute Engine, it’s possible to spin up well over a thousand servers in a matter of minutes and run scalable applications at blazing-fast speeds. In fact, MapR ran Hadoop on Google Compute Engine and set a world record for MinuteSort, sorting 15 billion 100-byte records in only 60 seconds. The run used 2,103 virtual instances, each consisting of four virtual cores and a virtual disk.
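To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted above (record count, record size, time and instance count); the function name and the decimal units (1 TB = 10^12 bytes) are assumptions made for illustration, and the per-instance figure is a simple average, not a measured rate.

```python
# Figures quoted in the article for the MinuteSort run.
RECORDS = 15_000_000_000      # records sorted
RECORD_BYTES = 100            # bytes per record
SECONDS = 60                  # MinuteSort time budget
INSTANCES = 2_103             # virtual instances used
CORES_PER_INSTANCE = 4        # virtual cores per instance


def minutesort_summary():
    """Derive data volume and average throughput from the quoted figures.

    Hypothetical helper for illustration; uses decimal units.
    """
    total_bytes = RECORDS * RECORD_BYTES
    aggregate_bytes_per_s = total_bytes / SECONDS
    return {
        "total_tb": total_bytes / 1e12,
        "aggregate_gb_per_s": aggregate_bytes_per_s / 1e9,
        "per_instance_mb_per_s": aggregate_bytes_per_s / INSTANCES / 1e6,
        "total_cores": INSTANCES * CORES_PER_INSTANCE,
    }


if __name__ == "__main__":
    for name, value in minutesort_summary().items():
        print(f"{name}: {value:,.2f}")
```

In other words, the quoted run works out to 1.5 TB sorted in a minute, or about 25 GB per second across the whole cluster.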

The Hadoop/Google virtualized cloud environment set the record using far fewer servers, disks and cores than Yahoo used in setting the prior record. To put it simply, Hadoop on Google Cloud Platform not only does more with less, it does so faster than the biggest and best on-premise Big Data platforms. This type of performance allows enterprises to tackle large-scale workloads quickly and easily, gaining greater business insights and a competitive advantage that drive higher ROI.

Cost-effectiveness

According to MapR CEO John Schroeder, who discussed Hadoop and Google Compute Engine at Google I/O, the physical hardware that an enterprise would need to approximate what Yahoo used to achieve its 62-second benchmark would conservatively cost $6 million to acquire and take several months to install. And those estimates, Schroeder explains, don’t even factor in the cost of all the electrical power needed to handle the server load, not to mention the 50-75 tons of air conditioning that would be required to cool the data center. By contrast, Schroeder says, the cost of running Hadoop on Google Compute Engine for the 54 seconds it took to set the new 1TB Terasort benchmark was a mere $16.
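The gap between those two figures can be made concrete with a small sketch. It uses only the numbers Schroeder quotes; the function name is hypothetical, and the ratio compares an up-front hardware purchase with the usage charge for a single run, so treat it as an illustration of scale and not as a like-for-like cost model.

```python
# Figures quoted in the article (attributed to MapR CEO John Schroeder).
ON_PREM_HARDWARE_USD = 6_000_000   # estimated hardware cost to match Yahoo's setup
CLOUD_RUN_USD = 16                 # cost of the record run on Google Compute Engine
PRIOR_RECORD_S = 62                # Yahoo's benchmark time, in seconds
NEW_RECORD_S = 54                  # new 1TB Terasort time, in seconds


def cost_comparison():
    """Compare the quoted figures (hypothetical helper for illustration)."""
    seconds_saved = PRIOR_RECORD_S - NEW_RECORD_S
    return {
        # Capital outlay versus a single pay-per-use run.
        "cost_ratio": ON_PREM_HARDWARE_USD / CLOUD_RUN_USD,
        "seconds_saved": seconds_saved,
        "time_reduction_pct": 100 * seconds_saved / PRIOR_RECORD_S,
    }


if __name__ == "__main__":
    for name, value in cost_comparison().items():
        print(f"{name}: {value:,.1f}")
```

On the quoted numbers, the hardware estimate is 375,000 times the cost of the single cloud run, and the new benchmark is 8 seconds (roughly 13 percent) faster.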

Using Google as the cloud provider eliminates the need for enterprises to pay huge costs for on-premise servers that must be swapped out for newer models every three years and may never be used to full capacity. Enterprises pay Google only for the resources they use to meet their data processing demands. And the costs associated with running Enterprise Hadoop on Google Compute Engine are extremely reasonable compared to traditional infrastructure.

In short, if you’re looking for a flexible, fast, and cost-effective Big Data platform, MapR’s Hadoop distribution running on Google Compute Engine just might be the right solution for your business.
