Defending against universal perturbations with shared adversarial training

C. Mummadi, Thomas Brox, J. Metzen
IEEE International Conference on Computer Vision (ICCV), 2019
Abstract: Classifiers such as deep neural networks have been shown to be vulnerable against adversarial perturbations on problems with high-dimensional input space. While adversarial training improves the robustness of image classifiers against such adversarial perturbations, it leaves them sensitive to perturbations on a non-negligible fraction of the inputs. In this work, we show that adversarial training is more effective in preventing universal perturbations, where the same perturbation needs to fool a classifier on many inputs. Moreover, we investigate the trade-off between robustness against universal perturbations and performance on unperturbed data and propose an extension of adversarial training that handles this trade-off more gracefully. We present results for image classification and semantic segmentation to showcase that universal perturbations that fool a model hardened with adversarial training become clearly perceptible and show patterns of the target scene.

Other associated files: mummadi_iccv2019.pdf [11.2MB]


BibTeX reference

  author       = "C. K. Mummadi and T. Brox and J. H. Metzen",
  title        = "Defending against universal perturbations with shared adversarial training",
  booktitle    = "IEEE International Conference on Computer Vision (ICCV)",
  month        = " ",
  year         = "2019",
  url          = "http://lmb.informatik.uni-freiburg.de/Publications/2019/Bro19b"
