Public vs. Private Cloud: How to Integrate Your Data Across Both

December 30, 2015

Private clouds are a natural extension of the virtualization revolution of the late 1990s and early 2000s, and they give organizations the ability to quickly create virtual machine environments, whether they are running vSphere, OpenStack, CloudStack or some other technology. Ultimately, though, a private cloud is built on the capital expense of bare-metal hardware in a data center you are responsible for.

Public clouds—including Amazon Web Services, Microsoft Azure, Google Cloud Platform and other IaaS market players—enable an organization to lease virtual machines across the Internet for hours or even minutes at a time. Utilizing this pay-as-you-go model can be especially helpful for workloads with unpredictable demands so that an organization can handle peaks without the underutilized capacity that would otherwise come during slower periods.

Many organizations are opting for a hybrid approach, using private cloud in certain situations and public cloud in others. A key consideration with such a strategy has to do with handling data that might have to be spread over multiple physical locations across the public Internet, and the latency as well as security concerns that might arise.

Everybody Has Everything: Cross Internet Master/Master Replication

The simplest approach to replicating data between a public and private cloud, or across multiple public clouds, is the same solution used for exclusively internal use cases: master/master replication. Keeping data replicated and synchronized across multiple locations ensures data integrity. The nuance is that replication traffic now runs across the public Internet and requires additional security measures.

Replication latency is a strong consideration here as well. Depending on how an application uses its data, teams may need to proceed with caution, since replication traffic is now carried over much larger distances.
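One classic wrinkle with master/master replication is reconciling the same record updated on two masters before replication catches up. As a minimal illustration (the record shape and names here are hypothetical, not tied to any particular replication product), a last-write-wins merge based on write timestamps is one common resolution policy:

```python
def merge_records(local, remote):
    """Last-write-wins merge of two versions of the same record.

    Each version is a dict with a 'value' and the epoch time
    'updated_at' of the write. Ties favor the local copy.
    """
    if remote["updated_at"] > local["updated_at"]:
        return remote
    return local

# Example: the same customer row updated on a private-cloud master
# and, slightly later, on a public-cloud master.
private_copy = {"value": "old address", "updated_at": 1000.0}
public_copy = {"value": "new address", "updated_at": 1005.0}

merged = merge_records(private_copy, public_copy)
print(merged["value"])  # the later public-cloud write wins
```

Last-write-wins is only one policy; it silently discards the losing write, which may or may not be acceptable for a given application, and it assumes reasonably synchronized clocks across sites.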

Single Version of the Truth Approach

Alternatively, should data gravity issues prevent a master/master replication approach, data may have to reside in a single place while multiple front ends access it from wherever they are running. The security issues remain the same, and a more formally structured REST API in front of the single data source can help address them. Replication latency concerns, however, get replaced by transactional latency between the data source and the consuming application layer. Those transactional concerns can often be mitigated with creative caching approaches on the consumption end, so that requests for data back at the single source are minimized.

Avoiding Hybrid Cloud Data Issues with Workload Placement Guidelines

Another way to avoid data issues across public and private clouds is to simply choose one or another based on workload type and not have any particular workload straddle both. Some workloads have steady demand or sensitive data, which makes them better suited for the firewalled, fixed capacity confines of a private cloud. Financial analytics and Human Resources workloads are good examples.

Other workloads see wide variations in demand and have publicly viewable data that make them a great fit for the elasticity of the public cloud. A customer-facing marketing website or customer analytics that have been sanitized to remove Personally Identifiable Information are typical candidates.

So, instead of choosing both for a particular application, establish guidelines for your entire portfolio of applications and decide to run each individual application on one or the other depending upon demand variability and data sensitivity.
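Such guidelines can be made concrete as a simple decision rule. The scoring scale and thresholds below are illustrative assumptions, not a prescription; the point is that sensitivity should veto public placement before demand variability is even considered:

```python
def choose_cloud(demand_variability, data_sensitivity,
                 variability_threshold=0.5, sensitivity_threshold=0.5):
    """Toy placement rule: inputs are scores in [0, 1].

    Sensitive data pins a workload to the private cloud regardless
    of demand; otherwise, high demand variability favors the
    elasticity of the public cloud.
    """
    if data_sensitivity >= sensitivity_threshold:
        return "private"
    if demand_variability >= variability_threshold:
        return "public"
    return "private"

# Examples mirroring the guidelines above:
print(choose_cloud(0.2, 0.9))  # HR workload: sensitive data -> private
print(choose_cloud(0.8, 0.1))  # marketing site: bursty, public data -> public
```

A real portfolio review would weigh more factors (compliance regimes, data gravity, licensing), but even a two-axis rule like this forces each application onto one side or the other instead of straddling both.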

The Choice Is Yours

Every organization has its own challenges, strengths and key performance indicators, so no single choice is right for everyone. Some applications should be deployed across multiple clouds in a hybrid fashion; replicating data using long-tested master/master methods can succeed in such situations when the security concerns are addressed. Single-data-source techniques can also prove useful, especially when a REST API is established in front of the data and caching is used on the consuming side. An equally valid approach is to opt against single-application hybrids and instead set guidelines for how demand variability and data sensitivity thresholds dictate which applications get deployed where.
