arXiv:1912.11370 [cs.CV]

Large Scale Learning of General Visual Representations for Transfer

Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, Joan Puigcerver, Jessica Yung, Sylvain Gelly, Neil Houlsby

Published 2019-12-24 (Version 1)

Transfer of pre-trained representations improves sample efficiency and simplifies hyperparameter tuning when training deep neural networks for vision. We revisit the paradigm of pre-training on large supervised datasets and fine-tuning the weights on the target task. We scale up pre-training and propose a simple recipe that we call Big Transfer (BiT). By combining a few carefully selected components, and transferring using a simple heuristic, we achieve strong performance on over 20 datasets. BiT performs well across a surprisingly wide range of data regimes, from 10 to 1M labeled examples per task. BiT achieves 87.8% top-1 accuracy on ILSVRC-2012, 99.3% on CIFAR-10, and 76.7% on the Visual Task Adaptation Benchmark (which includes 19 tasks). On small datasets, BiT attains 86.4% on ILSVRC-2012 with 25 examples per class, and 97.6% on CIFAR-10 with 10 examples per class. We conduct a detailed analysis of the main components that lead to high transfer performance.
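To make the pre-train-then-fine-tune recipe concrete, here is a minimal sketch in PyTorch. It is an illustration only, not the paper's implementation: it assumes a stock torchvision ResNet-50 as a stand-in for the BiT backbone (the paper's models use group normalization and weight standardization, and the "simple heuristic" is the BiT-HyperRule, which sets schedule, resolution, and MixUp from the dataset size). The function name make_finetune_model and the hyperparameters shown are hypothetical choices for the example.

```python
# Minimal transfer-learning sketch in the spirit of BiT:
# take weights pre-trained on a large supervised dataset,
# swap in a fresh task head, and fine-tune all weights.
import torch
import torch.nn as nn
import torchvision

def make_finetune_model(num_classes: int) -> nn.Module:
    # Hypothetical stand-in backbone: an ImageNet-pretrained
    # torchvision ResNet-50 rather than the paper's BiT-ResNet.
    model = torchvision.models.resnet50(
        weights=torchvision.models.ResNet50_Weights.IMAGENET1K_V2
    )
    # Replace the classification head with one sized for the
    # target task; BiT initializes the new head to zeros.
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    nn.init.zeros_(model.fc.weight)
    nn.init.zeros_(model.fc.bias)
    return model

model = make_finetune_model(num_classes=10)  # e.g. CIFAR-10
# Fine-tune the whole network with plain SGD + momentum, as in the
# paper; the learning rate here is an assumed example value, whereas
# the BiT-HyperRule derives the full schedule from dataset size.
optimizer = torch.optim.SGD(model.parameters(), lr=3e-3, momentum=0.9)
```

In practice, the appeal of the recipe is that this one fine-tuning procedure transfers to many downstream tasks without per-task hyperparameter search.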

Related articles:
arXiv:1910.04867 [cs.CV] (Published 2019-10-01)
The Visual Task Adaptation Benchmark
Xiaohua Zhai et al.
arXiv:2003.13502 [cs.CV] (Published 2020-03-30)
A Comparison of Data Augmentation Techniques in Training Deep Neural Networks for Satellite Image Classification
arXiv:1406.2080 [cs.CV] (Published 2014-06-09, updated 2015-04-10)
Training Convolutional Networks with Noisy Labels