Google Teh Evil? Cloud economics, BigTable + GFS vs. EU privacy laws

Editor SDC
Last updated: 2009/02/05 at 5:28 PM

At the recent Google I/O conference, Google Fellow Jeff Dean gave a talk about the “inner workings” of Google’s data centres. There was a writeup on C-Net; the talk seems to (re)use some material from the deck available here. This is fascinating stuff. Buried in this stream of data (in the C-Net piece, the money shot, for my purposes here, is literally in the last paragraph) is information about the architecture of GFS (Google File System) and BigTable, the technologies used by Google to store (and retrieve) data scalably. Arguably, the combination of GFS and BigTable is Google’s cloud computing offering; GAE (Google App Engine) is just one stack that might run on top of it.

There’s an aspect of this architecture that didn’t get a lot of press, and doesn’t seem to have registered with a larger audience yet, and I think it should — if for no other reason than I think it hands Amazon a big fat advantage in European (and possibly Asian) cloud computing markets.

In Dean’s slides, there’s the following statement:

Scheduling system + GFS + BigTable + MapReduce work well within single clusters

Followed by:

Truly global systems to span all our datacenters • Global namespace with many replicas of data worldwide

In the C-Net article, he’s quoted as saying:

“We want our next-generation infrastructure to be a system that runs across a large fraction of our machines rather than separate instances,” Dean said.

Right now some massive file systems have different names–GFS/Oregon and GFS/Atlanta, for example–but they’re meant to be copies of each other. “We want a single namespace,” he said.

So what’s the big deal with that? Well, in a nutshell, European law versus U.S. law. Wildly different understandings of privacy and data protection, coupled with even more wildly different attitudes about government powers, result in a situation ripe for conflict. Things like the Patriot Act have resulted in a situation where European organisations simply categorically forbid any storage of data in the United States — and note, for further splitting of hairs, that it’s unclear what the “storage” of data really means, and it may be broad enough to include processing of data within U.S. jurisdictions, even if it’s persisted elsewhere.

There are laws on the books in several European countries (like Germany, where I live) that literally forbid situations like the ones that seem likely to result from Google’s architecture. Now, IANAL (I am not a lawyer), and it’s possible that I am completely wrong for that reason alone. Perhaps a nuanced reading of European privacy and data protection laws simply makes the apparent problem go away. There’s sure one hell of a lot of documentation to read on the subject, more than enough to keep any number of lawyers busy for years, as a quick glance at some of the links I’ve been squirrelling away on the topic should make evident.

When I was at the Enterprise 2.0 conference in Boston in June, there was a session called “An Evening in the Clouds”, and during the Q&A at the end, I asked Google’s Jeff Keltner about the issue directly. He kind of dodged the question with a corp-speak “We’re evaluating that” answer (and, in all fairness, I would have done the same thing in his place), but then went on to suggest that I might, in fact, be overstating the problem. He suggested that a lot of the worries he’s encountered from European customers wound up being FUD (fear, uncertainty and doubt) about things like the Patriot Act, and that when the lawyers all sat down together and really examined the problems, lots of them just went away.

Maybe.

But I think this argument is underestimating a significant aversion to risk in enterprisey organisations. Lots of, if not most, buying decisions never make it to a desk in the legal department — they get made long before that, in the chain of the buying process, during decisions about who’s in the running and so forth. And the fact is, in many large organisations, the aversion to risk is the converse of the popularity of the “path of least resistance” strategy. If there’s even a suggestion that going with Google for cloud services might not be the path of least resistance (because, say, it’s merely unclear what all the legal ramifications might be), that will often be enough to skew the decision making process against them. And when I see things like this, I see a case in point. The bottom line is, there are laws on the books in the EU that stand in direct conflict with the needs of Google’s architecture, and no amount of hand waving will make that fact go away.

On the other hand, it’s possible that Google’s architecture can be adapted to allow for a more nuanced implementation. On those same slides from Jeff Dean, we also see this statement:

Users specify high-level desires: “Store this data on at least 2 disks in EU, 2 in U.S. & 1 in Asia”

An API that could do that implies an API that could also be used to store the data only on “disks” in a particular region, for regulatory rather than performance reasons. But again, it’s not clear that storage alone is the problem, and Google’s ambition of achieving a global namespace implies that data will flit back and forth: sometimes on U.S. hardware, sometimes not. Since it’s not clear whether that kind of movement is in scope of existing EU laws, we’re back to considering the path of least resistance problem.
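To make that point concrete, here is a minimal sketch in Python of what such a placement request might look like. Nothing here is Google's actual API: `PlacementPolicy`, `place`, the region labels and the disk names are all hypothetical, invented purely for illustration. The only idea taken from the slide is "at least N disks in region X"; the `allowed_regions` restriction is my assumption about how a regulatory constraint could be bolted on.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PlacementPolicy:
    """Hypothetical policy: minimum replica counts per region, plus an optional allow-list."""

    min_replicas: dict[str, int]
    # None means "replicas may live anywhere"; a set restricts storage to those regions.
    allowed_regions: frozenset[str] | None = None

    def validate(self) -> None:
        """Reject a policy that asks for replicas outside its own allow-list."""
        if self.allowed_regions is None:
            return
        outside = set(self.min_replicas) - self.allowed_regions
        if outside:
            raise ValueError(f"policy requests replicas in forbidden regions: {sorted(outside)}")


def place(policy: PlacementPolicy, disks_by_region: dict[str, list[str]]) -> list[str]:
    """Pick disks that satisfy the policy, or raise if a region has too few disks."""
    policy.validate()
    chosen: list[str] = []
    for region, needed in sorted(policy.min_replicas.items()):
        available = disks_by_region.get(region, [])
        if len(available) < needed:
            raise ValueError(f"region {region} has {len(available)} disks, need {needed}")
        chosen.extend(available[:needed])
    return chosen


# Illustrative inventory; the disk names are made up.
disks = {"EU": ["eu-1", "eu-2", "eu-3"], "US": ["us-1", "us-2"], "ASIA": ["asia-1"]}

# The slide's example: 2 disks in EU, 2 in U.S., 1 in Asia (a performance-style request).
global_policy = PlacementPolicy({"EU": 2, "US": 2, "ASIA": 1})

# The regulatory variant: same mechanism, but storage is confined to the EU.
eu_only_policy = PlacementPolicy({"EU": 3}, allowed_regions=frozenset({"EU"}))
```

Note what the sketch does not cover: it only constrains where replicas are persisted. It says nothing about where the data is processed or which machines it transits, which is exactly the open question above.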

Amazon seems to be in a better position on this. Their architecture is not reliant on increasing the degree of globalisation in the same way that Google’s seems to be. Thus, they have no difficulties adapting to the current state, and that is what they are doing with their European operations, which are currently limited to S3, but which will supposedly be expanded to include EC2 and other services “real soon now”.

James Urquhart came up with a fascinating meme related to this issue, which he calls “follow the law” computing (and make sure to follow some of the links from James’ blog as well). The basic idea is that software would become aware of these issues, and be cleverly partitioned to delegate processing (and, presumably, storage) to the legal jurisdiction that provides the most favourable environment for it. That’s a brilliant idea, essentially the flip side of what I’ve been musing about here — for certain transactions, it may be economically ideal to conduct them in a particular jurisdiction (say, the Cayman Islands), and therefore, the software would be partitioned to do just that. My imagination runs wild with that idea — consider the possibilities that open up: market forces would come into play on the keepers of legal jurisdictions (typically, countries). Jurisdictions could find themselves competing to provide the most favourable environments — that already happens, of course, but software like this would dramatically accelerate the effect (similar to the way automation changed currency trading markets). The mind boggles at the implications.
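As a thought experiment, the "follow the law" idea can be sketched in a few lines of Python. This is my own toy model, not anything James has published: `Jurisdiction`, `choose_jurisdiction`, the data categories and the cost figures are all hypothetical, and the permissions listed are illustrative placeholders rather than statements about what any real jurisdiction allows.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Jurisdiction:
    """Hypothetical description of a legal jurisdiction as the software sees it."""

    name: str
    permits: frozenset[str]  # data categories that may be processed here (made up)
    cost: float  # relative cost of running the workload here (made up)


def choose_jurisdiction(data_category: str, candidates: list[Jurisdiction]) -> Jurisdiction:
    """Return the cheapest jurisdiction that is allowed to handle this category of data."""
    eligible = [j for j in candidates if data_category in j.permits]
    if not eligible:
        raise LookupError(f"no jurisdiction permits processing of {data_category!r}")
    # Ties on cost are broken by name so the choice is deterministic.
    return min(eligible, key=lambda j: (j.cost, j.name))


# Illustrative values only; not legal or pricing facts.
candidates = [
    Jurisdiction("Germany", frozenset({"eu_personal", "financial"}), cost=1.0),
    Jurisdiction("United States", frozenset({"financial"}), cost=0.8),
    Jurisdiction("Cayman Islands", frozenset({"financial"}), cost=0.5),
]
```

With these made-up numbers, a "financial" workload lands in the Cayman Islands because it is the cheapest eligible option, while "eu_personal" data can only go to Germany. The legal filter comes first and the economic choice second, which is where the market pressure on jurisdictions would come from.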

These are complex issues, and there are no clear cut answers to some of these things (in other words, stuff for lawyers to do). Having said that, I do think there’s cause for concern. At work, I was talking about this whole topic recently with somebody, and the general tenor of the conversation was something like “Google vs. the EU? Google will lose. MSFT did, after all”. And we had a chuckle full of Schadenfreude at Google’s expense. “But,” I said, in a tone not free of sarcasm, “if there is a company capable of changing the world to fit its architecture, rather than the other way round, then surely it’s Goog.” The next day, this showed up in the New York Times, and the last paragraph (again!) almost had me spew my tea onto my PowerBook:

In addition, businesses that operate on both sides of the Atlantic are pushing to make sure they are not caught between conflicting legal obligations.

“This will require compromise,” said Peter Fleischer, the global privacy counsel for Google. “It will require people to agree on a framework that balances two conflicting issues: privacy and security. But the need to develop that kind of framework is becoming more important as more data moves onto the Internet and circles across the global architecture.”

Indeed. Why do I get the sneaking feeling that Google, doer of good, is not putting my interests above its own in this?
