arXiv:2401.14707 [cs.CV]

Mitigating Feature Gap for Adversarial Robustness by Feature Disentanglement

Nuoyan Zhou, Dawei Zhou, Decheng Liu, Xinbo Gao, Nannan Wang

Published 2024-01-26, Version 1

Deep neural networks are vulnerable to adversarial samples. Adversarial fine-tuning methods aim to enhance adversarial robustness by fine-tuning a naturally pre-trained model with adversarial training. However, we find that some latent features of adversarial samples are confused by the adversarial perturbation, which leads to an unexpectedly growing gap between the last-hidden-layer features of natural and adversarial samples. To address this issue, we propose a disentanglement-based approach that explicitly models and then removes the latent features that cause the feature gap. Specifically, we introduce a feature disentangler that separates these latent features from the features of the adversarial samples, boosting robustness by eliminating them. In addition, we align features in the pre-trained model with features of adversarial samples in the fine-tuned model, to further benefit from natural-sample features that are free of this confusion. Empirical evaluations on three benchmark datasets demonstrate that our approach surpasses existing adversarial fine-tuning methods and adversarial training baselines.
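The two quantities the abstract refers to, the gap between natural and adversarial features and the alignment between pre-trained and fine-tuned features, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `feature_gap` and `alignment_loss`, the choice of Euclidean distance for the gap, and the choice of mean squared error for alignment are all assumptions made for illustration, and the feature vectors are toy values rather than real network activations.

```python
import math


def feature_gap(natural_feats, adversarial_feats):
    """Mean Euclidean distance between paired feature vectors.

    Assumption: the "feature gap" is measured as an average L2 distance
    between last-hidden-layer features of each natural sample and its
    adversarial counterpart. The abstract does not specify the metric.
    """
    if not natural_feats or len(natural_feats) != len(adversarial_feats):
        raise ValueError("expected equally sized, non-empty feature lists")
    distances = [math.dist(n, a) for n, a in zip(natural_feats, adversarial_feats)]
    return sum(distances) / len(distances)


def alignment_loss(pretrained_feats, finetuned_adv_feats):
    """Mean squared difference between two sets of feature vectors.

    Assumption: alignment is expressed as a mean squared error between
    features from the pre-trained model and features of adversarial
    samples from the fine-tuned model. This is a hypothetical stand-in.
    """
    if not pretrained_feats or len(pretrained_feats) != len(finetuned_adv_feats):
        raise ValueError("expected equally sized, non-empty feature lists")
    total, count = 0.0, 0
    for p, f in zip(pretrained_feats, finetuned_adv_feats):
        for p_i, f_i in zip(p, f):
            total += (p_i - f_i) ** 2
            count += 1
    return total / count


# Toy 2-dimensional features for two samples (illustrative values only).
natural = [[1.0, 0.0], [0.0, 1.0]]
adversarial = [[1.0, 3.0], [4.0, 1.0]]

gap = feature_gap(natural, adversarial)
loss = alignment_loss(natural, adversarial)
print(gap, loss)
```

In a real setting the inputs would be activations extracted from the last hidden layer of the networks, and a smaller gap after fine-tuning would indicate that adversarial features sit closer to their natural counterparts.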

Comments: 8 pages, 6 figures

Categories: cs.CV, cs.AI, cs.LG

Keywords: mitigating feature gap, adversarial samples, feature disentanglement, latent features, existing adversarial fine-tuning methods

Related articles:

arXiv:2211.01598 [cs.CV] (Published 2022-11-03)

Robust Few-shot Learning Without Using any Adversarial Samples

Gaurav Kumar Nayak, Ruchit Rawal, Inder Khatri, Anirban Chakraborty

arXiv:1803.06731 [cs.CV] (Published 2018-03-18)

Discriminative Learning of Latent Features for Zero-Shot Recognition

Yan Li, Junge Zhang, Jianguo Zhang, Kaiqi Huang

arXiv:1709.00672 [cs.CV] (Published 2017-09-03)

Unsupervised feature learning with discriminative encoder