Big Data, Small Details: How Metadata Creates Security Risks

What happens when you put a photograph online? In most cases, not much happens at all. It simply exists in cyberspace – permanently – alongside trillions of other user-generated images. Besides being overwhelming in number, most photos on the web don’t contain any valuable information in the image itself. What information there is, though, is hidden in the metadata.

With the rise of big data, that metadata is suddenly becoming valuable. If a picture is worth a thousand words, its metadata may be worth far more. And now it could pose a security risk.

What’s In The Metadata?

There are many kinds of information tucked into your photos’ metadata. One of the primary pieces of information, though, is the GPS position stored in the Exif data. When you take a picture with your phone and location services are enabled for the camera, the phone embeds your location in the file. If you upload that photo immediately – or regularly upload from a private location such as your home – then you reveal your whereabouts. It’s why actress Emma Watson doesn’t take photos with fans anymore; the security risk is too great.
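To give a sense of how precise that embedded position is, here is a minimal Python sketch. Exif stores latitude and longitude as degrees, minutes, and seconds, each written as a fraction, plus a reference letter (N, S, E, or W). The function name `dms_to_decimal` and the sample coordinates are made up for illustration. Reading the values out of a real file would need an Exif parser, which this sketch leaves out.

```python
from fractions import Fraction


def dms_to_decimal(dms, ref):
    """Convert Exif-style (degrees, minutes, seconds) into signed decimal degrees.

    `dms` is three (numerator, denominator) pairs, the way Exif stores
    GPSLatitude and GPSLongitude; `ref` is the matching GPSLatitudeRef or
    GPSLongitudeRef letter: "N", "S", "E", or "W".
    """
    degrees, minutes, seconds = (Fraction(num, den) for num, den in dms)
    decimal = degrees + minutes / 60 + seconds / 3600
    # South latitudes and west longitudes are negative by convention.
    if ref in ("S", "W"):
        decimal = -decimal
    return float(decimal)


# Hypothetical values, shaped like the fields a phone might write.
latitude = dms_to_decimal(((40, 1), (26, 1), (4614, 100)), "N")
longitude = dms_to_decimal(((79, 1), (58, 1), (5616, 100)), "W")
print(round(latitude, 6), round(longitude, 6))
```

Seconds recorded to two decimal places, as in this sample, pin a position down to well under a meter of latitude, which is why a single photo taken at home can be enough to reveal an address.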

Photo metadata typically becomes more complex over time as users modify their files. There are fields for titles, photographer name, copyright information, and more. The more you do with a photo before you upload it, the more information goes with it.

A Growing Risk

New technologies like facial recognition software have only increased the dangers associated with photo metadata – and the risks of photo mining and app breaches more generally. We’ve seen this happen before with shifts in all kinds of technology. User technology takes a step forward, but hackers and data analysts move even more quickly. As many as 110 million people were affected by Target’s 2013 security breach, and we can expect worse. The growth of big data comes with bigger security risks, but with everything it’s capable of, we’re compelled to keep moving forward. Our role, then, is to identify the greatest risks and minimize them.

Identifying Trouble Points

The easiest way to identify image-based security risks is to ask where we use, manipulate, and store images. The average person, for example, might upload their images to Facebook and Instagram, but also to printing sites to make family photo albums or create holiday cards. Any of these platforms could be an easy target for data theft, though for the most part the platforms themselves are interested in metadata for targeted marketing.

People also commonly upload their photos to cloud storage services such as Google Drive, Box, or Microsoft OneDrive. We want to believe these systems are secure, and for the most part they are. But each has its own agenda. Box, for example, recently partnered with Google to apply Cloud Vision image recognition to photo uploads. The aim is to streamline workflows by extracting relevant business data from uploads across a company’s users. But what’s to stop them from analyzing your private images? What would they learn about your family and friends?

Data analysis gets smarter every day. Facebook has a patent on technology that identifies people who may know each other by analyzing photo metadata and comparing dust and scratches on the camera lens. People who upload pictures with similar file names and matching lens patterns may be connected socially – or they may never have met and simply attended the same public event. Facebook may still try to connect them.

The worst-case scenario in metadata mining involves stalking and harassment – and it’s already happened. Want to find someone at home? Look at the Exif data on their personal photos and you can be outside their house or workplace in no time. The fact is, unless upload platforms automatically erase Exif and other data, or users begin to aggressively eliminate it before posting their images, that information is out there. And you can’t erase lens dust or scratch patterns. Data miners will always find a way to learn more about you. The best we can do is erase what we have access to. That’s security first.
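As a rough illustration of what erasing that data can look like, here is a minimal Python sketch for JPEG files. It relies on the fact that a JPEG is a sequence of marked segments and that Exif data sits in the APP1 segment. The function name `strip_app1_segments`, the `segment` helper, and the sample bytes are hypothetical. The sample is a hand-built stand-in, not a real photo, and real files can carry metadata in other segments that this sketch leaves alone.

```python
import struct


def strip_app1_segments(jpeg_bytes):
    """Return a copy of a JPEG byte string without its APP1 (Exif/XMP) segments."""
    if jpeg_bytes[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing start-of-image marker")
    out = bytearray(b"\xff\xd8")
    pos = 2
    while pos + 4 <= len(jpeg_bytes):
        if jpeg_bytes[pos] != 0xFF:
            raise ValueError("unexpected byte where a marker should start")
        marker = jpeg_bytes[pos + 1]
        if marker == 0xDA:
            # Start of scan: compressed image data follows, so copy the rest as-is.
            out += jpeg_bytes[pos:]
            return bytes(out)
        # The two-byte big-endian length counts itself plus the payload.
        (length,) = struct.unpack(">H", jpeg_bytes[pos + 2:pos + 4])
        segment_end = pos + 2 + length
        if marker != 0xE1:  # keep every segment except APP1
            out += jpeg_bytes[pos:segment_end]
        pos = segment_end
    out += jpeg_bytes[pos:]
    return bytes(out)


def segment(marker, payload):
    """Build one marker segment for the hand-made sample below (illustration only)."""
    return b"\xff" + bytes([marker]) + struct.pack(">H", len(payload) + 2) + payload


# A tiny hand-built stand-in for a photo: header, fake Exif block, fake image data.
sample = (
    b"\xff\xd8"
    + segment(0xE0, b"JFIF\x00" + bytes(9))
    + segment(0xE1, b"Exif\x00\x00" + b"fake GPS payload")
    + segment(0xDA, b"\x00") + b"image-data" + b"\xff\xd9"
)
cleaned = strip_app1_segments(sample)
print(b"Exif" in sample, b"Exif" in cleaned)
```

The sketch copies the image data untouched and drops only the metadata segment, so the picture itself is unchanged. It does nothing about what can be read from the pixels, such as the lens dust and scratch patterns mentioned above.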