arXiv Analytics

arXiv:2002.01256 [cs.LG]

Minimax Defense against Gradient-based Adversarial Attacks

Blerta Lindqvist, Rauf Izmailov

Published 2020-02-04, Version 1

State-of-the-art adversarial attacks target neural network classifiers. By default, neural networks use gradient descent to minimize their loss function. Gradient-based adversarial attacks use the gradient of a classifier's loss function to generate adversarially perturbed images. We pose the question of whether another type of optimization could give neural network classifiers an edge. Here, we introduce a novel approach that uses minimax optimization to foil gradient-based adversarial attacks. Our minimax classifier is the discriminator of a generative adversarial network (GAN) that plays a minimax game with the GAN generator. In addition, our GAN generator projects all points onto a manifold that differs from the original manifold, since the original manifold might be the cause of adversarial attacks. To measure the performance of our minimax defense, we use three adversarial attacks, Carlini-Wagner (CW), DeepFool, and the Fast Gradient Sign Method (FGSM), on three datasets: MNIST, CIFAR-10, and German Traffic Sign (TRAFFIC). Against CW attacks, our minimax defense achieves 98.07% (MNIST default: 98.93%), 73.90% (CIFAR-10 default: 83.14%), and 94.54% (TRAFFIC default: 96.97%). Against DeepFool attacks, our minimax defense achieves 98.87% (MNIST), 76.61% (CIFAR-10), and 94.57% (TRAFFIC). Against FGSM attacks, we achieve 97.01% (MNIST), 76.79% (CIFAR-10), and 81.41% (TRAFFIC). Our minimax adversarial approach represents a significant shift in defense strategy for neural network classifiers.
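The abstract gives no implementation details, so the following is a minimal, hypothetical sketch (in PyTorch) of the idea as described: the classifier is the discriminator of a GAN, trained in a minimax game against the generator, and the generator is used at test time to project inputs onto its learned manifold before classification. The network sizes, loss terms, and the latent-space projection procedure below are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch (not the paper's code): a classifier built as a GAN
# discriminator that plays a minimax game with a generator, plus a
# test-time projection of inputs onto the generator's manifold.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_CLASSES, LATENT_DIM, IMG_DIM = 10, 64, 28 * 28  # illustrative sizes

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(LATENT_DIM, 256), nn.ReLU(),
            nn.Linear(256, IMG_DIM), nn.Tanh())
    def forward(self, z):
        return self.net(z)

class Discriminator(nn.Module):
    """Discriminator doubles as the classifier: a real/fake head and a class head."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(nn.Linear(IMG_DIM, 256), nn.LeakyReLU(0.2))
        self.adv_head = nn.Linear(256, 1)            # real vs. generated
        self.cls_head = nn.Linear(256, NUM_CLASSES)  # class logits
    def forward(self, x):
        h = self.features(x)
        return self.adv_head(h), self.cls_head(h)

G, D = Generator(), Discriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

def train_step(x_real, y):
    # Discriminator/classifier update: separate real from generated images
    # and classify real images correctly (loss weighting is an assumption).
    z = torch.randn(x_real.size(0), LATENT_DIM)
    x_fake = G(z).detach()
    adv_real, cls_real = D(x_real)
    adv_fake, _ = D(x_fake)
    d_loss = (F.binary_cross_entropy_with_logits(adv_real, torch.ones_like(adv_real))
              + F.binary_cross_entropy_with_logits(adv_fake, torch.zeros_like(adv_fake))
              + F.cross_entropy(cls_real, y))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator update: the minimax counterpart, trying to fool the discriminator.
    z = torch.randn(x_real.size(0), LATENT_DIM)
    adv_fake, _ = D(G(z))
    g_loss = F.binary_cross_entropy_with_logits(adv_fake, torch.ones_like(adv_fake))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

def classify_with_projection(x, steps=50, lr=0.1):
    # One plausible reading of "projects all points onto a manifold":
    # find the latent code whose generation best reconstructs x,
    # then classify that on-manifold reconstruction instead of x.
    z = torch.zeros(x.size(0), LATENT_DIM, requires_grad=True)
    opt_z = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        loss = F.mse_loss(G(z), x)
        opt_z.zero_grad(); loss.backward(); opt_z.step()
    _, logits = D(G(z).detach())
    return logits.argmax(dim=1)

# Smoke test on random data standing in for MNIST-sized batches.
x = torch.rand(8, IMG_DIM) * 2 - 1
y = torch.randint(0, NUM_CLASSES, (8,))
train_step(x, y)
print(classify_with_projection(x))
```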

Related articles:
arXiv:2006.01456 [cs.LG] (Published 2020-06-02)
Perturbation Analysis of Gradient-based Adversarial Attacks
arXiv:1901.09178 [cs.LG] (Published 2019-01-26)
A general model for plane-based clustering with loss function
arXiv:2003.04173 [cs.LG] (Published 2020-03-09)
Gradient-based adversarial attacks on categorical sequence models via traversing an embedded world