arXiv Analytics


arXiv:2005.02540 [cs.LG]

Proper measure for adversarial robustness

Hyeongji Kim, Ketil Malde

Published 2020-05-06, Version 1

This paper analyzes problems with standard adversarial accuracy and the standard adversarial training method. We argue that standard adversarial accuracy fails to properly measure the robustness of classifiers, because its definition allows the regions assigned to clean samples and to adversarial examples to overlap. As a result, there is a trade-off between accuracy and standard adversarial accuracy, and using standard adversarial training can therefore lower accuracy. In addition, standard adversarial accuracy can favor classifiers with more invariance-based adversarial examples: samples whose predicted classes remain unchanged even though their perceptual classes have changed. In this paper, we introduce a new measure of classifier robustness, called genuine adversarial accuracy, to address the problems of standard adversarial accuracy. It measures the adversarial robustness of classifiers without a trade-off between accuracy on clean data and on adversarially perturbed samples, and it does not favor a model with invariance-based adversarial examples. We show that a single nearest neighbor (1-NN) classifier is the most robust classifier according to genuine adversarial accuracy for given data and a metric when the exclusive belongingness assumption is used. This result provides a fundamental step toward training adversarially robust classifiers.
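The intuition behind the 1-NN claim can be illustrated with a minimal sketch (this is not the paper's code, and the function name `nn_predict` is hypothetical): under a given metric, a 1-NN prediction at a point is the label of the closest training sample, so a perturbation smaller than half the distance to the nearest differently labeled sample cannot flip the prediction.

```python
import numpy as np

def nn_predict(X_train, y_train, x):
    """Return the label of the training sample nearest to x (Euclidean metric)."""
    dists = np.linalg.norm(X_train - x, axis=1)
    return y_train[np.argmin(dists)]

# Two clean samples from different classes, distance 2.0 apart.
X = np.array([[0.0], [2.0]])
y = np.array([0, 1])

# Perturbing the class-0 sample by less than half the gap (1.0) keeps it
# closer to its own class, so the 1-NN prediction is unchanged.
assert nn_predict(X, y, np.array([0.9])) == 0
# Past the midpoint, the nearest sample (and hence the prediction) switches.
assert nn_predict(X, y, np.array([1.1])) == 1
```

This margin-based behavior is only a toy illustration of why 1-NN is maximally robust for the given data under a metric; the paper's genuine adversarial accuracy formalizes the non-overlapping regions that make this argument precise.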

Comments: 13 pages. This paper supersedes the paper "Finding a human-like classifier" (https://openreview.net/forum?id=BJeGFs9FsH).
Categories: cs.LG, cs.CR, stat.ML
Related articles:
arXiv:2005.01452 [cs.LG] (Published 2020-05-04)
Do Gradient-based Explanations Tell Anything About Adversarial Robustness to Android Malware?
arXiv:1910.10679 [cs.LG] (Published 2019-10-23)
A Useful Taxonomy for Adversarial Robustness of Neural Networks
arXiv:1912.09855 [cs.LG] (Published 2019-12-20)
Explainability and Adversarial Robustness for RNNs