
arXiv:1910.04867 [cs.CV]

The Visual Task Adaptation Benchmark

Xiaohua Zhai, Joan Puigcerver, Alexander Kolesnikov, Pierre Ruyssen, Carlos Riquelme, Mario Lucic, Josip Djolonga, Andre Susano Pinto, Maxim Neumann, Alexey Dosovitskiy, Lucas Beyer, Olivier Bachem, Michael Tschannen, Marcin Michalski, Olivier Bousquet, Sylvain Gelly, Neil Houlsby

Published 2019-10-01, Version 1

Representation learning promises to unlock deep learning for the long tail of vision tasks without expensive labelled datasets. Yet, the absence of a unified yardstick to evaluate general visual representations hinders progress. Many sub-fields promise representations, but each has different evaluation protocols that are either too constrained (linear classification), limited in scope (ImageNet, CIFAR, Pascal-VOC), or only loosely related to representation quality (generation). We present the Visual Task Adaptation Benchmark (VTAB): a diverse, realistic, and challenging benchmark to evaluate representations. VTAB embodies one principle: good representations adapt to unseen tasks with few examples. We run a large VTAB study of popular algorithms, answering questions like: How effective are ImageNet representations on non-standard datasets? Are generative models competitive? Is self-supervision useful if one already has labels?

Categories: cs.CV, cs.LG, stat.ML

Keywords: visual task adaptation benchmark, general visual representations

Related articles:

arXiv:1912.02783 [cs.CV] (Published 2019-12-05)

Self-Supervised Learning of Video-Induced Visual Invariances

Michael Tschannen, Josip Djolonga, Marvin Ritter, Aravindh Mahendran, Neil Houlsby, Sylvain Gelly, Mario Lucic

arXiv:1912.11370 [cs.CV] (Published 2019-12-24)

Large Scale Learning of General Visual Representations for Transfer

Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, Joan Puigcerver, Jessica Yung, Sylvain Gelly, Neil Houlsby

arXiv:2010.02808 [cs.CV] (Published 2020-10-06)

Representation learning from videos in-the-wild: An object-centric approach

Rob Romijnders, Aravindh Mahendran, Michael Tschannen, Josip Djolonga, Marvin Ritter, Neil Houlsby, Mario Lucic