You May Not Be as Anonymous as You Think

Do you prefer watching Wheel of Fortune or Jeopardy? For me, it depends on what kind of mood I’m in. If I’m in need of an ego boost, I go for Wheel of Fortune. But if I want to be challenged or I need a bite of Humble Pie, I tune into Jeopardy.

Ironically, I was reminded of these game shows last week at the IAPP Global Privacy Summit 2015. During two of the sessions I attended, the speakers pointed out that when it comes to Personally Identifiable Information (PII) – i.e., the data that can be used to uniquely identify an individual – there’s still some misunderstanding about anonymized data.

Case in point: An organization’s privacy policy may state that your anonymized data may be shared with third parties. The mistake we tend to make is that we assume that since our data is anonymized, our identities can’t be revealed. But that’s not always the case.

About anonymous and pseudonymous data. I’m going to use the infamous Wheel of Fortune (WOF) board to help demonstrate this:

If you’re a WOF fan, you may remember this particular board from one of last year’s shows. Before I blow your mind with what happened, let’s talk about anonymous and pseudonymous data.

Anonymous is when there really isn’t a way to identify an individual with the data/information provided. In other words, the data can’t be linked back to a particular individual. If we look at our WOF board, each new round in the game reveals the board in an anonymous state, namely:

The white tiles are blank – i.e., no letters have been revealed
The category – e.g., person, place, thing, etc. – has not been announced
No letters have been called out by the contestants
You have no idea what the phrase is
Vanna is smiling at you

If we were to remove all the letters in the photo above (NE, THING, RSTLNE, and HMDO), this would be an example of anonymous data. We cannot yet identify the phrase the white blank tiles are hiding.

Pseudonymous, on the other hand, is when an individual’s identity is not known, but can be made known through association with similar or related data. Here’s our WOF board in a pseudonymous state:

Some white letters are revealed (“N” and “E” in the photo above)
The category is known (e.g., “THING”)
Contestants have called out correct and incorrect letters
As more letters are called out, the easier it becomes to guess the phrase
Vanna is still smiling at you

In the game show represented in the photo above, Emil, the contestant, was in the final round. All letters had been called and played – R, S, T, L, N, E, H, M, D and O – and he had 30 seconds to figure out what 3-word phrase began with the letters “NE.” And yes, he did it! With very little data (letters) to go on, Emil was able to figure out the true identity of the phrase.

Why this matters. What may be classified as anonymous may, in fact, be pseudonymous. With pseudonymity, an ID or number of some sort is typically used to tie back to an individual – without revealing any PII, such as the person’s name.

For example, let’s say you buy some raffle tickets, and the number on one of your tickets is called. You can now use your raffle ticket to pick up your prize – without ever revealing your name and identity. That’s how pseudonymity works: It softly hides the identity of a person, yet with a little elbow grease, the identity may be figured out.

Recent developments. As I discussed in my last post, one anonymized or pseudonymized dataset is not the issue. It’s when multiple datasets get pulled together and once-deidentified individuals have now become reidentifiable. Without their knowledge. Without their understanding of how this combined data may be used to help (or harm) them.

Last week’s Global Privacy Summit came on the heels of the White House’s announcement about the draft Consumer Privacy Bill of Rights Act – legislation designed to protect consumers’ privacy in this day and age. During one session, the sentiment – even from the FTC – was that even though the legislation is a step in the right direction, it falls short on many fronts for a first draft – especially given that it was first announced back in 2012. There are still too many loopholes for companies and not enough control for consumers. God. Bless. America.

One final thought. The Global Privacy Summit brought in over 3,000 privacy professionals from around the globe – from governments, law firms, technology companies, and more. It was my first time attending an IAPP event as a member, and I loved being with folks who are passionate about privacy, and are working full throttle to tackle the data privacy and security challenges this internet/big data age has ushered in. We’re making good progress, but we have a long way to go.

Let’s keep the faith. It’s a big data world out there. Now let’s be safe.

P.S. Did you figure out the WOF puzzle above? Here’s the answer.