AI Can Recognize Your Face Even If It’s Pixelated
Pixelation has long been a familiar fig leaf to cover our visual media’s most private parts. Blurred chunks of text or obscured faces and license plates show up on the news, in redacted documents, and online. The technique is nothing fancy, but it has worked well enough, because people can’t see or read through the distortion. The problem, however, is that humans aren’t the only image recognition masters around anymore. As computer vision becomes increasingly robust, it’s starting to see things we can’t.
Researchers at the University of Texas at Austin and Cornell Tech say that they’ve trained a piece of software that can undermine the privacy benefits of standard content-masking techniques like blurring and pixelation by learning to read or see what’s meant to be hidden in images—anything from a blurred house number to a pixelated human face in the background of a photo. And they didn’t even need to painstakingly develop extensive new image uncloaking methodologies to do it. Instead, the team found that mainstream machine learning methods—the process of “training” a computer with a set of example data rather than programming it—lend themselves readily to this type of attack.
“The techniques we’re using in this paper are very standard in image recognition, which is a disturbing thought,” says Vitaly Shmatikov, one of the authors from Cornell Tech. Since the machine learning methods employed in the research are widely known—to the point that there are tutorials and training manuals online—Shmatikov says it would be possible for a bad actor with a baseline of technical knowledge to carry out these types of attacks. Additionally, more powerful object and facial recognition techniques already exist that could potentially go even further in defeating methods of visual redaction.
Images from each of the four datasets. The leftmost image is the original, while the next four columns show increasingly intense pixelation, and the last three columns show three levels of masking using P3. The more extensive the obfuscation, the lower the machine learning software’s rates of success at identifying the underlying image. But in most of the researchers’ tests, the software still identified the obfuscated text or face in more than 50 percent of cases.
The researchers were able to defeat three privacy protection technologies, starting with YouTube’s proprietary blur tool. YouTube allows uploaders to select objects or figures that they want to blur, but the team used their attack to identify obfuscated faces in videos. In another example of their method, the researchers attacked pixelation (also called mosaicing). To generate different levels of pixelation, they used their own implementation of a standard mosaicing technique that the researchers say is found in Photoshop and other common programs. And finally, they attacked a tool called Privacy Preserving Photo Sharing (P3), which encrypts identifying data in JPEG photos so humans can’t see the overall image, while leaving other data components in the clear so computers can still do things with the files like compress them.
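For readers curious what standard mosaicing does to an image, here is a minimal sketch in Python. It is not the researchers' implementation, which the article does not describe in detail. It assumes the common textbook approach of replacing every square block of pixels with that block's average value. The function name `mosaic`, the grayscale list-of-rows image format, and the sample values are all illustrative assumptions.

```python
def mosaic(image, block):
    """Pixelate a grayscale image (list of rows of ints) with square blocks."""
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for top in range(0, height, block):
        for left in range(0, width, block):
            # Collect the pixels in this block (edge blocks may be smaller).
            rows = range(top, min(top + block, height))
            cols = range(left, min(left + block, width))
            values = [image[r][c] for r in rows for c in cols]
            mean = round(sum(values) / len(values))
            # Overwrite every pixel in the block with the block's mean.
            for r in rows:
                for c in cols:
                    out[r][c] = mean
    return out


# Hypothetical 4x4 grayscale image, pixelated with 2x2 blocks.
image = [
    [0, 0, 100, 100],
    [0, 0, 100, 100],
    [50, 150, 200, 200],
    [50, 150, 200, 240],
]
pixelated = mosaic(image, 2)
```

Each block's average still depends on what was underneath it, and a larger `block` value discards more detail. That leftover signal is what a trained classifier can pick up on.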
To execute the attacks, the team trained neural networks to perform image recognition by feeding them data from four large and well-known image sets for analysis. The more words, faces, or objects a neural network “sees,” the better it gets at spotting those targets. Once the neural networks achieved roughly 90 percent accuracy or better on identifying relevant objects in the training sets, the researchers obfuscated the images using the three privacy tools and then further trained their neural networks to interpret blurred and pixelated images based on knowledge of the originals.
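The shape of that training loop can be sketched with a toy stand-in. The Python below is not the researchers' neural network. It swaps in a much simpler nearest-centroid classifier, skips the initial training on clear images, and uses made-up 8x8 images, so every name and number in it is an assumption for illustration. The idea it shows is the same, though: obfuscate images whose labels are known, learn from the obfuscated versions, then classify obfuscated images the model has never seen.

```python
import random


def block_means(image, block):
    """Return the mean of each block-by-block tile: all that a mosaic keeps."""
    size = len(image)
    feats = []
    for top in range(0, size, block):
        for left in range(0, size, block):
            tile = [image[r][c]
                    for r in range(top, top + block)
                    for c in range(left, left + block)]
            feats.append(sum(tile) / len(tile))
    return feats


def make_image(label, rng, size=8):
    """Hypothetical toy data: label 0 is bright on the left, label 1 on top."""
    half = size // 2
    image = []
    for r in range(size):
        row = []
        for c in range(size):
            bright = (c < half) if label == 0 else (r < half)
            base = 200 if bright else 50
            row.append(base + rng.randint(-30, 30))  # per-pixel noise
        image.append(row)
    return image


def train(samples):
    """Average the obfuscated features per label (a nearest-centroid model)."""
    sums, counts = {}, {}
    for feats, label in samples:
        acc = sums.setdefault(label, [0.0] * len(feats))
        for i, value in enumerate(feats):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc]
            for label, acc in sums.items()}


def predict(model, feats):
    """Pick the label whose centroid is closest in squared distance."""
    def dist(label):
        return sum((a - b) ** 2 for a, b in zip(model[label], feats))
    return min(model, key=dist)


rng = random.Random(0)
# Train on obfuscated images whose true labels are known.
training = [(block_means(make_image(y, rng), 4), y)
            for y in (0, 1) for _ in range(20)]
model = train(training)
# Evaluate on obfuscated images the model has never seen.
test_set = [(block_means(make_image(y, rng), 4), y)
            for y in (0, 1) for _ in range(20)]
accuracy = sum(predict(model, f) == y for f, y in test_set) / len(test_set)
```

The toy classes are deliberately easy to tell apart, so the high accuracy here says nothing about real faces. The point is only that the classifier never needs to undo the pixelation; it learns what each label looks like after pixelation.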
Finally, they used obfuscated test images that the neural networks hadn’t yet been exposed to in any form to see whether the image recognition could identify faces, objects, and handwritten numbers. For some data sets and masking techniques, the neural network success rates exceeded 80 percent and even 90 percent. In the case of mosaicing, the more intensely pixelated images were, the lower the success rates got. But their de-obfuscating machine learning software was often still in the 50 percent to 75 percent range. The lowest success rate was 17 percent on a data set of celebrity faces obfuscated with the P3 redaction system. If the computers had been randomly guessing to identify the faces, shapes, and numbers, however, the researchers calculated that the success rates for each test set would have been at most 10 percent and as low as a fifth of a percent, meaning that even relatively low identification success rates were still far better than guessing.
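The comparison with guessing is simple arithmetic, sketched below in Python. It assumes a uniform random guess among equally likely labels, in which case the chance of being right is one divided by the number of labels. Under that assumption, a 10 percent baseline corresponds to 10 possible labels and a fifth of a percent to 500; those label counts are inferred from the percentages, not taken from the paper. The helper names `chance_rate` and `lift` are hypothetical.

```python
def chance_rate(num_labels):
    """Probability of a correct uniform random guess among num_labels options."""
    return 1 / num_labels


def lift(success_rate, baseline):
    """How many times better than the baseline a success rate is."""
    return success_rate / baseline


high_baseline = chance_rate(10)    # the "at most 10 percent" case
low_baseline = chance_rate(500)    # the "fifth of a percent" case

# The lowest reported success rate, 17 percent, against each baseline.
# The article does not say which baseline applies to that data set,
# so these two values only bracket the possibilities.
lift_vs_high = lift(0.17, high_baseline)
lift_vs_low = lift(0.17, low_baseline)
```

Even against the most generous baseline, the weakest result is still well above chance, and against the strictest baseline the gap is far larger.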
Even if the group’s machine learning method couldn’t always penetrate the effects of redaction on an image, it still represents a serious blow to pixelation and blurring as a privacy tool, says Lawrence Saul, a machine learning researcher at University of California, San Diego. “For the purposes of defeating privacy, you don’t really need to show that 99.9 percent of the time you can reconstruct” an image or string of text, says Saul. “If 40 or 50 percent of the time you can guess the face or figure out what the text is then that’s enough to render that privacy method as something that should be obsolete.”
It’s worth noting that the research isn’t doing image reconstruction from scratch, and can’t reverse the obfuscation to actually recreate pictures of the faces or objects it’s identifying. The technique can only find what it knows to look for—not necessarily an exact image, but things it’s seen before, like a certain object or a previously identified person’s face. For example, in hours of CCTV footage from a train station with every passerby’s face blurred, it wouldn’t be able to identify every individual. But if you suspected that a particular person had walked by at a particular time, it could spot that person’s face among the crowd even in an obfuscated video. Saul notes that an additional challenge would have been to test the neural networks on obfuscated images collected from a broader array of real-world situations and conditions, instead of only testing on more standardized images from existing data sets. But based on their current findings, he argues that more practical application would likely be possible.
The researchers’ larger goal is to warn the privacy and security communities that advances in machine learning as a tool for identification and data collection can’t be ignored. There are ways to defend against these types of attacks, as Saul points out, like using black boxes that offer total coverage instead of image distortions that leave traces of the content behind. Better yet is to cut out any random image of a face and use it to cover the target face before blurring, so that even if the obfuscation is defeated, the identity of the person underneath still isn’t exposed. “I hope the result of this paper will be that nobody will be able to publish a privacy technology and claim that it’s secure without going through this kind of analysis,” Shmatikov says. Putting an awkward black blob over someone’s face in a video may be less standard today than pixelating it out. But it may soon be a necessary step to keep vision far more penetrating than ours from piercing those pixels.
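The reason a solid box holds up where distortion does not can be shown in a few lines of Python. This is a minimal sketch with hypothetical names (`black_box`, `face_a`, `face_b`) and made-up pixel values, assuming a grayscale image stored as a list of rows. Because every covered pixel is overwritten with the same constant, two different faces produce identical output in the covered region, and nothing about the original survives there for a classifier to learn from.

```python
def black_box(image, top, left, height, width, fill=0):
    """Return a copy of image with a rectangle overwritten by a constant."""
    out = [row[:] for row in image]
    for r in range(top, top + height):
        for c in range(left, left + width):
            out[r][c] = fill
    return out


# Two hypothetical "faces" that differ only in their top two rows.
face_a = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
face_b = [[90, 10, 45], [5, 200, 33], [70, 80, 90]]

# Cover the top two rows of each with a solid box.
masked_a = black_box(face_a, 0, 0, 2, 3)
masked_b = black_box(face_b, 0, 0, 2, 3)
indistinguishable = masked_a == masked_b
```

The box only protects what it covers. Anything left visible outside the rectangle, such as hair, clothing, or surroundings, is untouched by this sketch.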