Facebook’s Big Data: Equal Parts Exciting and Terrifying?



Facebook, the popular social network with over 1.2 billion users worldwide, has not just big but gigantic amounts of data at its disposal, making it a big data paradise.



We, the users of Facebook, happily feed their big data beast. Every day we send 10 billion Facebook messages, click the ‘Like’ button 4.5 billion times and upload 350 million new pictures. Overall, there are 17 billion location-tagged posts and a staggering 250 billion photos on Facebook.

All this information means that Facebook knows what we look like, who our friends are, what our views are on most things, when our birthday is, whether we are in a relationship or not, where we are, what we like and dislike, and much more. This is an awful lot of information (and power) in the hands of one commercial company.

As someone who helps companies get to grips with big data, I am in awe of the big data gold mine Facebook is creating. I believe that even if we all stopped using Facebook today, the company would have enough detailed insights about us to exploit for years. No other company in history has ever possessed this level of detailed personal information and I believe that, apart from Google maybe, there is no other company on the planet that comes close to those levels of ‘intimate’ big data.

Of course, Facebook is acutely aware of this and their entire business model is based on the effective exploitation of their big data. The more we use Facebook, the more they will learn about us and the more valuable the information will become. Facebook is investing heavily in their ability to collect, store and analyze all the data we provide, but their hunger for data doesn’t stop there.

Facebook goes beyond simply analyzing and ‘mining’ the user profile data. USA Today revealed how Facebook tracks users across the Web. Using ‘tracking cookies’, Facebook can collect information about each website you visit. This means that when you are logged into Facebook and then browse the web (completely separately from your Facebook activities), Facebook knows what sites you are visiting.

Facebook has also invested in image processing and ‘face recognition’ capabilities that basically allow Facebook to track you, because it knows what you and your friends look like from the photos you have shared. It can now search the Internet and all other Facebook profiles to find pictures of you and your friends.

Face recognition allows Facebook to make ‘tag suggestions’ for people in photos you have uploaded, but it is mind-boggling what else they could do with technology like that. Just imagine how Facebook could use computer algorithms to track your body shape. They could analyze the latest beach shots you have shared and compare them with older ones to detect that you have put on some weight. They could then sell this information to a slimming club in your area, which could place an ad on your Facebook page. Scary?

There is more: a recent study shows that it is possible to accurately predict a range of highly sensitive personal attributes simply by analyzing the ‘Likes’ you have clicked on Facebook. The work conducted by researchers at Cambridge University and Microsoft Research shows how the patterns of Facebook ‘Likes’ can very accurately predict your sexual orientation, satisfaction with life, intelligence, emotional stability, religion, alcohol use and drug use, relationship status, age, gender, race and political views among many others. Interestingly, those “revealing” ‘Likes’ can have little or nothing to do with the actual attributes they help to predict and often a single ‘Like’ is enough to generate an accurate prediction.
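To make the idea of predicting an attribute from ‘Likes’ a little more concrete, here is a minimal sketch in Python. It is not the researchers' method, which the study describes in its own terms. It is a simple smoothed log-odds score per ‘Like’, and the page names, the toy users and the function names (`train_like_scores`, `predict`) are all made up for illustration.

```python
import math
from collections import defaultdict

# Hypothetical toy data: (set of liked pages, whether the user has some trait).
# Page names and labels are invented purely for illustration.
TOY_USERS = [
    ({"page_a", "page_b"}, True),
    ({"page_a", "page_c"}, True),
    ({"page_a"}, True),
    ({"page_c", "page_d"}, False),
    ({"page_d"}, False),
    ({"page_b", "page_d"}, False),
]


def train_like_scores(users):
    """Return a log-odds score per Like, using add-one smoothing."""
    pos_total = sum(1 for _, label in users if label)
    neg_total = len(users) - pos_total
    pos_counts = defaultdict(int)
    neg_counts = defaultdict(int)
    for likes, label in users:
        for like in likes:
            if label:
                pos_counts[like] += 1
            else:
                neg_counts[like] += 1

    scores = {}
    for like in set(pos_counts) | set(neg_counts):
        # Smoothed share of trait / non-trait users who clicked this Like.
        p_pos = (pos_counts[like] + 1) / (pos_total + 2)
        p_neg = (neg_counts[like] + 1) / (neg_total + 2)
        scores[like] = math.log(p_pos / p_neg)
    return scores


def predict(scores, likes):
    """Sum the scores of a user's Likes and squash the total into a 0..1 value."""
    total = sum(scores.get(like, 0.0) for like in likes)
    return 1 / (1 + math.exp(-total))


scores = train_like_scores(TOY_USERS)
print(predict(scores, {"page_a"}))  # above 0.5: leans towards the trait
print(predict(scores, {"page_d"}))  # below 0.5: leans away from the trait
```

Even in this toy version, a single ‘Like’ that is clicked far more often by one group than the other is enough to push the prediction well away from 50/50, although the page itself need not have anything obvious to do with the trait.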

I have one big concern about the way Facebook uses our data: I feel it is not done in a truly transparent way. Their excuse has always been that it is all stated in the small print. But how many of us really read the pages and pages of small print before we sign up? And do we re-read everything each time Facebook updates their privacy policy? The answer is NO.

It looks as if most Facebook users agree with me, but two of them feel that Facebook has gone too far by systematically scanning the content of private messages. As revealed by the FT recently, Facebook has been hit with a class-action lawsuit. Users Matthew Campbell from Arkansas and Michael Hurley from Oregon have filed a lawsuit on behalf of over 166 million Facebook users in the US. The accusation is that Facebook is violating the Electronic Communications Privacy Act by scanning and exploiting the content of private messages sent via the Facebook platform without users' prior consent.

The issue here is that ‘private’ messages are seen by most users as exactly that: private! The accusation is that Facebook identifies website links (URLs) contained in private messages and then searches these websites in order to profile users. In their accusation Campbell and Hurley argue: “Representing to users that the content of Facebook messages is ‘private’ creates an especially profitable opportunity for Facebook, because users who believe they are communicating on a service free from surveillance are likely to reveal facts about themselves that they would not reveal had they known the content was being monitored.”
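How Facebook actually processes messages is not public, so the following is only a rough sketch of what ‘identifying website links in a message’ could look like in principle. The regular expression, the function name `extract_link_domains` and the sample message are my own assumptions, not anything taken from the lawsuit or from Facebook.

```python
import re
from urllib.parse import urlparse

# Assumed, deliberately simple pattern: anything starting with http:// or https://
# up to the next whitespace or quote character.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def extract_link_domains(message):
    """Return the host names of all links found in a message text."""
    domains = []
    for match in URL_PATTERN.findall(message):
        # Drop punctuation that trails the link in ordinary sentences.
        url = match.rstrip(".,;:!?)")
        host = urlparse(url).hostname
        if host:
            domains.append(host)
    return domains


message = "Have a look at https://example.com/recipes, it is great!"
print(extract_link_domains(message))
```

The point is how little it takes: a few lines of code turn a ‘private’ message into a list of websites that say something about the sender's interests.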

A Facebook spokesperson told Bloomberg that the allegations are without merit and that Facebook will defend itself vigorously. Of course they would say that. The challenge for Facebook is to strike the right balance between offering a customer service in the form of a free social networking platform and delivering shareholder returns, especially profits from selling data and advertising based on their big data insights.

To me, it feels like Facebook (as well as many other companies, including Google and Yahoo!) is trying to somehow hide the extent to which it is analyzing and mining our data. I feel that we need more transparency and maybe some control over the way our data can and cannot be used. I believe that improved transparency will help rebuild the reputation of big data analytics, which was tarnished by the NSA revelations.

But what do you think? Does it scare you that Facebook knows everything about you and could exploit and sell that information? Does this make Facebook too powerful? Please share your views…


Check out my other posts in The Big Data Guru column and feel free to connect with me via Twitter, LinkedIn, Facebook and The Advanced Performance Institute.

Bernard Marr is a best-selling author, keynote speaker, strategic performance consultant and analytics, KPI and Big Data guru.