arXiv Analytics

arXiv:2007.06993 [cs.CR]

Adversarial Examples and Metrics

Nico Döttling, Kathrin Grosse, Michael Backes, Ian Molloy

Published 2020-07-14 (Version 1)

Adversarial examples are a type of attack on machine learning (ML) systems which cause misclassification of inputs. Achieving robustness against adversarial examples is crucial for applying ML in the real world. While most prior work on adversarial examples is empirical, a recent line of work establishes fundamental limitations of robust classification based on cryptographic hardness. Most positive and negative results in this field, however, assume that there is a fixed target metric which constrains the adversary, and we argue that this is often an unrealistic assumption. In this work we study the limitations of robust classification if the target metric is uncertain. Concretely, we construct a classification problem which admits robust classification by a small classifier if the target metric is known at the time the model is trained, but for which robust classification is impossible for small classifiers if the target metric is chosen after the fact. In the process, we explore a novel connection between the hardness of robust classification and bounded storage model cryptography.
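
The role of the target metric can be illustrated with a minimal sketch (not taken from the paper): whether a candidate perturbation counts as an admissible adversarial example depends entirely on which metric and budget are fixed in advance. The metric names, the epsilon budget, and the toy vectors below are assumptions chosen purely for illustration.

```python
# Illustrative sketch: the set of admissible perturbations depends on the
# chosen target metric. All concrete values here (metrics, eps, vectors)
# are hypothetical and not from the paper.
import numpy as np

def is_admissible(x, x_adv, metric="linf", eps=0.1):
    """Return True if x_adv lies within the eps-ball around x under `metric`."""
    delta = x_adv - x
    if metric == "linf":
        return np.max(np.abs(delta)) <= eps
    if metric == "l2":
        return np.linalg.norm(delta) <= eps
    raise ValueError(f"unknown metric: {metric}")

# The same perturbation can be admissible under one metric but not another,
# which is why fixing the metric before training matters for robustness claims.
x = np.zeros(100)
x_adv = x + 0.1  # shift every coordinate by 0.1

print(is_admissible(x, x_adv, metric="linf", eps=0.1))  # True:  max |delta| = 0.1
print(is_admissible(x, x_adv, metric="l2", eps=0.1))    # False: ||delta||_2 = 1.0
```

A classifier trained to be robust under the first metric makes no guarantee under the second; the paper's separation result concerns exactly this gap when the metric is chosen only after the model is fixed.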

Related articles:
arXiv:2010.16204 [cs.CR] (Published 2020-10-30)
Capture the Bot: Using Adversarial Examples to Improve CAPTCHA Robustness to Bot Attacks
arXiv:2405.20778 [cs.CR] (Published 2024-05-28)
Improved Generation of Adversarial Examples Against Safety-aligned LLMs
arXiv:1909.08526 [cs.CR] (Published 2019-09-17)
Defending against Machine Learning based Inference Attacks via Adversarial Examples: Opportunities and Challenges