
arXiv:2105.03692 [cs.LG]

Provable Guarantees against Data Poisoning Using Self-Expansion and Compatibility

Published 2021-05-08, Version 1

A recent line of work has shown that deep networks are highly susceptible to backdoor data poisoning attacks. Specifically, by injecting a small amount of malicious data into the training distribution, an adversary gains the ability to control the model's behavior during inference. In this work, we propose an iterative training procedure for removing poisoned data from the training set. Our approach consists of two steps. We first train an ensemble of weak learners to automatically discover distinct subpopulations in the training set. We then leverage a boosting framework to recover the clean data. Empirically, our method successfully defends against several state-of-the-art backdoor attacks, including both clean-label and dirty-label attacks. We also present results from an independent third-party evaluation, including a recent adaptive poisoning adversary. The results indicate our approach is competitive with existing defenses against backdoor attacks on deep neural networks, and significantly outperforms the state-of-the-art in several scenarios.

Categories: cs.LG, cs.CR, stat.ML

Keywords: provable guarantees, self-expansion, compatibility, training set, deep neural networks

Related articles:

arXiv:1709.02802 [cs.LG] (Published 2017-09-08)

Towards Proving the Adversarial Robustness of Deep Neural Networks

Guy Katz, Clark Barrett, David L. Dill, Kyle Julian, Mykel J. Kochenderfer

arXiv:1611.05162 [cs.LG] (Published 2016-11-16)

Net-Trim: A Layer-wise Convex Pruning of Deep Neural Networks

Alireza Aghasi, Nam Nguyen, Justin Romberg

arXiv:1710.10570 [cs.LG] (Published 2017-10-29)

Weight Initialization of Deep Neural Networks(DNNs) using Data Statistics

Saiprasad Koturwar, Shabbir Merchant