Identity and Privacy: Early Indications, June 2011

With the soft launch of Google Plus, it’s an opportune time to think
about digital privacy, insofar as Google is explicitly targeting
widespread user dissatisfaction with Facebook’s treatment of their
personal information. The tagging feature that was used to build a
massive (hundreds of millions of users) facial recognition database,
for example, has important privacy implications. In standard
Facebook fashion, it’s turned on by default, and opting out once may
not guarantee that a user is excluded from the next wave of changes.

According to a 2010 poll developed at the University of Michigan and
administered by the American Customer Satisfaction Index, Facebook
scored in the bottom 5%, in the range of cable operators, airlines,
and the IRS. Even as Facebook is rumored to be holding off on
user-base announcements at the now-mundane 100-million intervals,
users are defecting. While the service is said to be closing in on 750 million
users globally, reports of 1% of that population in the U.S. and
Canada defecting in one month were not confirmed by the company, but
neither were they denied. A Google search on “Facebook fatigue”
returned 23 million hits. At the same time, Facebook delivers 31% of the 1.1 trillion ads served in the U.S. each quarter (Yahoo is a distant second at 10% share); those ads are expected to represent $4 billion in 2011 revenue.

With the Facebook IPO still impending, the questions about privacy
take on more urgency. What, really, is privacy? It’s clearly a
fundamental concept, typically conceived of as a human or civil right.
According to the Oxford English Dictionary, privacy is “the state or
condition of being alone, undisturbed, or free from public attention,
as a matter of choice or right; seclusion; freedom from interference
or intrusion.” It’s an old word, dating to the 14th century, that is
constantly being reinvented as times change.

Being left alone in a digital world is a difficult concept, however.
Here, NYU’s Helen Nissenbaum is helpful: “What people care most about
is not simply restricting the flow of information but ensuring that it
flows appropriately. . . .” Thus she does not wade further into the
definitional swamp, but spends a book’s worth of analysis* on the issue
of how people interact with the structures that collect, parse, and
move their information. (*Privacy in Context: Technology, Policy, and the Integrity of Social Life)

Through this lens, the following artifacts cannot be judged
as public or private, good or bad, acceptable or unacceptable, but
they can be discussed and considered in the context of people’s
values, choices, and autonomy: when I use X, is my information handled
in a way that I consent to in some reasonably informed way? The
digital privacy landscape is vast, including some familiar tools, and
for all the privacy notices I have received, there is a lot I don’t
know about the workings of most of these:

-loyalty card programs
-Google Street View
-toll-pass RFID tags
-surveillance cameras
-TSA no-fly lists
-Facebook data and actions
-credit-rating data
-Amazon browsing and purchase history
-Google search history
-Foursquare check-ins
-digital camera metadata
-expressed preferences such as star ratings, Facebook Likes, or eBay
seller feedback
-searchable digital public records such as court dates, house
purchases, or bankruptcy
-cell phone location and connection records
-medical records, electronic or paper
-Gmail correspondence
-TSA backscatter X-ray

Does such lack of knowledge mean that I have conceded privacy, or that
I am exposing aspects of my life I would rather not? Probably both.
In addition, the perfection of digital memory — handled properly,
bits don’t degrade with repeated copying — means that what these
entities know, they know for a very long time. The combination,
therefore, of lack of popular understanding of the mechanics of
personal information and the permanence of that information makes
privacy doubly suspect.

Scale

Over the past ten years, the events of 9/11 have conditioned the
privacy debate to an extraordinary degree.
The U.S. government was reorganized, search and seizure rules were
broadened, and rules of the game got more complicated: not only were
certain entities ordered to turn over information related to their
customers, they were obligated to deny that they had done so. More
centrally, the FBI’s well-documented failure to “connect the dots”
spurred a reorganization of multiple information silos into a vast and
possibly suboptimally sprawling Department of Homeland Security.

Governments have always wanted more information than people typically
wanted to give them. Given the new legal climate along with
improvements in the technologies of databases, information retrieval,
and image processing, for example, more is known about U.S.
individuals than at any time heretofore. (Whether it is known by the
proper people and agencies is a separate question.) At the January
2001 Super Bowl in Tampa, the entire crowd was scanned and matched
against an image database. Note the rhetoric employed even before the
terrorist attacks on the Twin Towers and the Pentagon:

“[Tampa detective Bill] Todd is excited about the biometric
crimestopper aid: The facial recognition technology is an extremely
fast, technologically advanced version of placing a cop on a corner,
giving him a face book of criminals and saying, Pick the criminals out
of the crowd and detain them. It’s just very fast and accurate.”

Note that the category of “criminals” can be conveniently defined: the
definition in Yemen, Libya, or Pakistan might be debatable, depending
on one’s perspective. In Tampa, civil liberties were not explicitly
addressed, nor was there judicial oversight:

“Concerned first and foremost with public safety, the Tampa police
used its judgment in viewing the images brought up on the monitor. Although the cameras permitted the police to view crimes captured by the cameras and apprehend suspects for pick-pocketing and other petty
crimes, their real goal was to ensure crowd safety. The Tampa Police were involved in forming the database and determining by threat level who was added to the database.” (emphasis added)

Letting a police force, which in any given locality may have
corruption issues as in large areas of Mexico, use digital records to
figuratively stand on a corner and pick “the criminals out of the
crowd” without probable cause is scary stuff. Also in this week’s
news, a major story concerns an FBI agent who protected his informant
from murder charges. Nor do police officers have to be corrupt to be
compromised: Mexican drug gangs are reportedly threatening U.S. law
enforcement officers with harm. Once the information and the
technology exist, they will be abused: the issue is how to design
safeguards into the process.

Consider RFID toll passes. According to a transportation industry
trade journal,

“The first case of electronic toll record tracking may have been in
September 1997, when the New York City Police Department used E-Z Pass toll records to track the movements of a car owned by New Jersey millionaire Nelson G. Gross who had been abducted and murdered. The police did not use a subpoena to obtain these records but asked the Metropolitan Transportation Authority and they complied.”

Again, the potential for privacy abuse emerged before protections did.
I could find no statistics for the number of E-ZPass and similar
tokens in current use, but it could well be in the tens of millions.
As only one in a number of highly revealing artifacts attached to a
person’s digital identity, toll tokens join a growing number of
sensors of which few people are aware. The on-board diagnostics (OBD)
system in a car, expanded from a mechanic’s engine diagnostic tool,
has become a “black box” like those recovered from airplane crashes.
Progressive Insurance is experimenting with data logging from the
devices as a premium-setting tool. Significantly, the logged data
does not include GPS information; the firm discontinued a GPS-based
experiment in 2000.

Invisibility

In its excellent “What They Know” investigative series in 2010, the
Wall Street Journal concluded that “they” know a lot. Numbers only
scratch the surface of the issues:

-Dictionary.com installed 234 tracking cookies in a single visit.
WSJ.com itself came in below average, at 60. Wikipedia.org was the
only site of 50 tested to install zero tracking software files.

-When Microsoft relaunched Internet Explorer in 2008, corporate
interests concerned about ad revenue vetoed a plan to make privacy
settings persistent. Thus users have to reset the privacy preferences
with every browser restart, and few people are aware of the settings
console in the first place.

-The Facebook Like button connects a behavior (an online vote, a
pursuit of a coupon, or an act of whim) to a flesh-and-blood person:
the Facebook profile’s presumably real name, real age, real sex, and
real location. Again according to the Journal,

“For example, Facebook or Twitter know when one of their members reads
an article about filing for bankruptcy on MSNBC.com or goes to a blog
about depression called Fighting the Darkness, even if the user
doesn’t click the “Like” or “Tweet” buttons on those sites.

For this to work, a person only needs to have logged into Facebook or
Twitter once in the past month. The sites will continue to collect
browsing data, even if the person closes their browser or turns off
their computers, until that person explicitly logs out of their
Facebook or Twitter accounts, the study found.”

-Few people realize how technologies can be used to follow them from
one realm to another. The giant advertising firm WPP recently
launched Xaxis, which, according to the Wall Street Journal (in a
story separate from its “What They Know” series), “will manage what it describes as the ‘world’s largest’ database of profiles of individuals that includes demographic, financial, purchase, geographic and other information collected from their Web activities and brick-and-mortar transactions. The database will be used to personalize ads consumers see on the Web, social-networking sites, mobile phones and ultimately, the TV set.”
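The mechanics behind the Like-button item above can be made concrete with a toy model. What follows is only a minimal sketch in Python, under stated assumptions, and not a description of how Facebook or Twitter actually build their systems: the class WidgetHost, the function visit, the member name, and the example page addresses are all hypothetical. It shows the one point the Journal's account turns on, namely that merely loading a page with an embedded button sends the login cookie and the page address back to the button's host, whether or not anyone clicks.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class WidgetHost:
    """Hypothetical social network that serves an embedded button."""
    sessions: dict = field(default_factory=dict)       # cookie -> member name
    browsing_log: list = field(default_factory=list)   # (member, page) pairs

    def log_in(self, member: str) -> str:
        """Logging in sets a cookie that persists until an explicit logout."""
        cookie = f"session-{len(self.sessions) + 1}"
        self.sessions[cookie] = member
        return cookie

    def log_out(self, cookie: str) -> None:
        """Only an explicit logout breaks the link between cookie and member."""
        self.sessions.pop(cookie, None)

    def serve_button(self, cookie: Optional[str], referring_page: str) -> None:
        """Runs whenever a browser loads a page that embeds the button."""
        member = self.sessions.get(cookie)
        if member is not None:
            # The host learns who loaded which page, click or no click.
            self.browsing_log.append((member, referring_page))


def visit(page: str, cookie: Optional[str], host: WidgetHost) -> None:
    """Loading a page fetches its embedded button from the host."""
    host.serve_button(cookie, page)


host = WidgetHost()
cookie = host.log_in("alice")
visit("news.example/filing-for-bankruptcy", cookie, host)
visit("blog.example/depression", cookie, host)
host.log_out(cookie)
visit("news.example/sports", cookie, host)  # after logout: not tied to alice
```

In this toy version the first two visits end up in the host's log under the member's name, and the third does not, which mirrors the study's finding that explicit logout is what stops the collection.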

In each of these examples, the companies involved ignored, or at
least lightly valued, Nissenbaum’s notion of
contextual integrity as it relates to the individual. Given the lack
of tangible consequences, it makes economic sense for them to do so.

Identity

Given that digital privacy seems almost to be a quaint notion in the
U.S. (Europeans live and are legally protected differently), a deeper
question emerges: if that OED sense of freedom from intrusion is being
reshaped by our many digital identities, who are we and what do we
control? Ads, spam, nearly continuous interruption (if we let
ourselves listen), and an often creepy sense of “how did they know
that?” as LinkedIn, Amazon, Google, Facebook, and Netflix home in on our
most cherished idiosyncrasies — all of these are embedded in the
contemporary connected culture. Many sites such as Lifehacker
recommend frequent pruning: e-mail offers, coupon sites, Twitter
feeds, and Facebook friends can multiply out of control, and saying no
often requires more deliberation than joining up.

Who am I? Not to get metaphysical, but the context for that question
is in flux. My fifth-grade teacher was fond of saying “tell me who
your friends are and I’ll tell you who you are.” What would he say to
today’s fifth-grader, who may well text 8,000 times a month and have a
public Facebook page?

Does it matter that a person’s political alignment, sexual
orientation, religious affiliation, and zip code (a reasonable proxy
for household income) are now a matter of public, searchable record?
Is her identity different now that so many facets of it are
transparent? Or is it a matter of Mark Zuckerberg’s vision — people
have one identity, and transparency is good for relationships — being
implicitly shared more widely across the planet? Just today, a review
of Google Plus argued that people don’t mind having one big list of
“friends,” even as Facebook scored poorly in this year’s customer
satisfaction index.

Indeed, one solution to the privacy dilemma is to overshare: if
nothing can possibly be held close, secrets lose their potency,
perhaps. (For an example, see the story of Hasan Elahi and his
Trackingtransience website in the May 2007 Wired and in Albert-László
Barabási’s book Bursts.) The recent vogue for YouTube
pregnancy-test videos is striking: one of life’s most meaningful,
trajectory-altering moments is increasingly an occasion to show the
world the heavy (water) drinking, the trips to the pharmacy and the
toilet, and the little colored indicator, followed by the requisite
reaction shots. (For more, see Marisa Meltzer’s piece on Slate,
wonderfully titled “WombTube.”)

The other extreme, opting out, is difficult. Living without a mobile
phone, without electronic books, without MP3 music files, without
e-mail, and of course without Facebook or Google is hard for many
to comprehend. In fact, the decision to unplug frequently goes
hand-in-hand with a book project, so unheard-of is the notion.

At the same time, the primacy of the word represented by these massive
information flows leaves out at least 10% of the adult U.S.
population, though functional illiteracy, by its nature, is difficult to
measure. One shocking statistic, presented without attribution by the
Detroit Literacy Coalition, pegs the number in that metro area at a
stunning 47%. Given a core population of about 4 million in the
3-county area, that’s well over 1 million adults who have few concerns
with Twitter feeds, Google searches, or allocating their 401(k)
portfolio.

In the middle, where most Americans now live, there’s an abundance of
grey area. As “what they know,” in the Journal’s words, grows and
what they can do with it expands, perhaps the erosion of analog
notions of privacy will be gradual but substantial. Another
possibility is some high-profile, disproportionately captivating event
that galvanizes reaction. The fastest adoption of a technology in
modern times was not GPS, or DVD, or even Facebook: it was the U.S.
government’s Do Not Call registry. Engineering privacy into browsers,
cell phones, and very large data stores is unlikely; litigation is,
unfortunately, a more likely outcome. Just today a U.S. federal judge
refused to halt a class-action suit against Google’s practice of
using its Street View cars for Wi-Fi sniffing. The story of privacy,
while old, is entering a fascinating, and exasperating, new phase, and
much remains to be learned, tested, and accepted as normal.