arXiv:1912.02783 [cs.CV]

Self-Supervised Learning of Video-Induced Visual Invariances

Michael Tschannen, Josip Djolonga, Marvin Ritter, Aravindh Mahendran, Neil Houlsby, Sylvain Gelly, Mario Lucic

Published 2019-12-05, Version 1

We propose a general framework for self-supervised learning of transferable visual representations based on video-induced visual invariances (VIVI). We consider the implicit hierarchy present in the videos and make use of (i) frame-level invariances (e.g. stability to color and contrast perturbations), (ii) shot/clip-level invariances (e.g. robustness to changes in object orientation and lighting conditions), and (iii) video-level invariances (semantic relationships of scenes across shots/clips), to define a holistic self-supervised loss. Training models using different variants of the proposed framework on videos from the YouTube-8M data set, we obtain state-of-the-art self-supervised transfer learning results on the 19 diverse downstream tasks of the Visual Task Adaptation Benchmark (VTAB), using only 1000 labels per task. We then show how to co-train our models jointly with labeled images, outperforming an ImageNet-pretrained ResNet-50 by 0.8 points with 10x fewer labeled images, as well as the previous best supervised model by 3.7 points using the full ImageNet data set.
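The idea of a single loss built from three levels of invariance can be illustrated with a minimal sketch. The abstract does not give the individual loss terms, so everything here is an assumption made for illustration: the InfoNCE-style contrastive term, the weighted sum, and the names `contrastive_loss`, `combined_loss`, and `weights` are hypothetical and are not taken from the paper.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def contrastive_loss(anchor, positive, negatives, temperature=1.0):
    """InfoNCE-style term (an assumed stand-in, not the paper's loss).

    The value is small when the anchor embedding is more similar to the
    positive than to any of the negatives.
    """
    logits = [dot(anchor, positive) / temperature]
    logits += [dot(anchor, n) / temperature for n in negatives]
    # Numerically stable log-sum-exp over all candidates.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[0]


def combined_loss(frame_term, shot_term, video_term, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of frame-, shot-, and video-level terms.

    The weighting scheme is hypothetical; the abstract only states that
    the three levels are combined into one self-supervised loss.
    """
    w_frame, w_shot, w_video = weights
    return w_frame * frame_term + w_shot * shot_term + w_video * video_term


# Toy 2-D embeddings: a frame, a perturbed copy of it, and an unrelated frame.
frame = [1.0, 0.0]
perturbed = [1.0, 0.0]
unrelated = [0.0, 1.0]

frame_term = contrastive_loss(frame, perturbed, [unrelated])
total = combined_loss(frame_term, shot_term=0.0, video_term=0.0)
```

In this toy setup the positive pair at each level would be chosen according to the hierarchy in the abstract: two perturbed views of one frame, two frames from the same shot, or two shots from the same video.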

Categories: cs.CV, cs.LG

Keywords: video-induced visual invariances, self-supervised learning, visual task adaptation benchmark, full imagenet data set, diverse downstream tasks

Related articles:

arXiv:2010.14713 [cs.CV] (Published 2020-10-28)

CompRess: Self-Supervised Learning by Compressing Representations

Soroush Abbasi Koohpayegani, Ajinkya Tejankar, Hamed Pirsiavash

arXiv:1711.06379 [cs.CV] (Published 2017-11-17)

Improvements to context based self-supervised learning

T. Nathan Mundhenk, Daniel Ho, Barry Y. Chen

arXiv:2007.16189 [cs.CV] (Published 2020-07-31)

Self-supervised learning through the eyes of a child

A. Emin Orhan, Vaibhav V. Gupta, Brenden M. Lake