ReFace: Real-time Adversarial Attacks on Facial Recognition Systems

Conference TBD


Deep neural network-based face recognition models have been shown to be vulnerable to adversarial examples. However, many past attacks require the adversary to solve an input-dependent optimization problem with gradient descent, which makes them impractical in real time. These adversarial examples are also tightly coupled to the model being attacked and do not transfer well to different models. In this work, we propose a real-time, highly transferable attack on face recognition models based on Adversarial Transformation Networks (ATNs). We find that the white-box attack success rate of a pure U-Net ATN falls substantially short of gradient-based attacks such as PGD on large face recognition datasets. We therefore propose a new ATN architecture that closes this gap while maintaining a 10,000x speedup over PGD. Furthermore, at a given perturbation magnitude, our ATN adversarial perturbations transfer to unseen face recognition models more effectively than PGD perturbations. ReFace attacks successfully deceive commercial face recognition services in a transfer attack setting, reducing face identification accuracy from 82% to 16.4% for the AWS SearchFaces API and face verification accuracy from 91% to 50.1% for Azure.
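To give a rough sense of why a generator-based (ATN-style) attack runs in real time while an iterative attack like PGD does not, the sketch below contrasts a single forward pass through a small convolutional perturbation generator with a per-image PGD loop against a face-embedding model. This is a minimal illustration, not the paper's architecture: the names PerturbationGenerator, embed_model, epsilon, and all hyperparameters are placeholders.

```python
# Hedged sketch: one forward pass (ATN-style generator) vs. an iterative per-image PGD loop.
# All names and hyperparameters are illustrative placeholders, not the paper's actual setup.
import torch
import torch.nn as nn
import torch.nn.functional as F

epsilon = 8 / 255  # assumed L-infinity perturbation budget

class PerturbationGenerator(nn.Module):
    """Tiny stand-in for a U-Net-style ATN: one forward pass yields a bounded perturbation."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1),
        )

    def forward(self, x):
        # tanh keeps the perturbation inside the L-infinity ball of radius epsilon
        return torch.clamp(x + epsilon * torch.tanh(self.body(x)), 0.0, 1.0)

def pgd_attack(embed_model, x, steps=50, alpha=2 / 255):
    """Per-image PGD: pushes the embedding away from the clean embedding."""
    with torch.no_grad():
        clean_emb = embed_model(x)
    x_adv = x.clone()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = -F.cosine_similarity(embed_model(x_adv), clean_emb).mean()
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()            # gradient ascent step
            x_adv = x + torch.clamp(x_adv - x, -epsilon, epsilon)
            x_adv = torch.clamp(x_adv, 0.0, 1.0).detach()
    return x_adv

# At attack time the generator costs a single forward pass per frame,
# while PGD repeats `steps` forward/backward passes for every image.
generator = PerturbationGenerator().eval()
frame = torch.rand(1, 3, 112, 112)   # stand-in for a webcam frame
adv_frame = generator(frame)         # real-time path
```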


ReFace Attack Demo - Real-time attack on a webcam stream


Attack Goals

We design a perturbation generator that operates in real time: given an input image, it finds a quasi-imperceptible adversarial perturbation which, when added to the image, distorts the predicted embedding vector and thereby degrades the verification and identification performance of the face recognition model. When attacking a face verification system, we adversarially perturb one of the two probe images; in this setting, our goal is to reduce the true recall rate of the verification system, i.e., its performance on positive pairs. When attacking a face identification system, we assume the probe images have been adversarially perturbed while the gallery images remain benign; in this setting, our goal is to lower the recognition rate of the face identification system.
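The sketch below illustrates how these two evaluation settings can be measured: positive-pair recall for verification when one probe in each pair is perturbed, and rank-1 recognition rate for identification when perturbed probes are matched against a benign gallery. It is a minimal sketch under assumed interfaces; embed_model, generator, the threshold, and the function names are placeholders, not the paper's evaluation code.

```python
# Hedged sketch of the two attack-evaluation settings described above.
# `embed_model`, `generator`, thresholds, and tensor shapes are illustrative assumptions.
import torch
import torch.nn.functional as F

def verification_recall(embed_model, generator, positive_pairs, threshold=0.5):
    """Positive-pair recall when ONE probe image in each pair is adversarially perturbed."""
    hits = 0
    for img_a, img_b in positive_pairs:           # each a (1, 3, H, W) tensor of the same identity
        adv_a = generator(img_a)                  # perturb one of the two probe images
        sim = F.cosine_similarity(embed_model(adv_a), embed_model(img_b))
        hits += int(sim.item() >= threshold)      # is the pair still verified as a match?
    return hits / len(positive_pairs)

def identification_accuracy(embed_model, generator, probes, probe_ids, gallery, gallery_ids):
    """Rank-1 recognition rate with perturbed probes against a benign gallery."""
    gallery_emb = F.normalize(embed_model(gallery), dim=1)   # gallery images stay unperturbed
    correct = 0
    for img, true_id in zip(probes, probe_ids):
        adv_emb = F.normalize(embed_model(generator(img)), dim=1)
        scores = adv_emb @ gallery_emb.T                     # cosine similarity to every gallery face
        predicted = gallery_ids[scores.argmax().item()]
        correct += int(predicted == true_id)
    return correct / len(probes)
```

A successful attack drives verification_recall down on positive pairs and identification_accuracy down against the benign gallery, which is what the AWS and Azure transfer results in the abstract report.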