arXiv:1906.00695 [cs.LG]

Continual learning with hypernetworks

Johannes von Oswald, Christian Henning, João Sacramento, Benjamin F. Grewe

Published 2019-06-03 (Version 1)

Artificial neural networks suffer from catastrophic forgetting when they are sequentially trained on multiple tasks. To overcome this problem, we present a novel approach based on task-conditioned hypernetworks, i.e., networks that generate the weights of a target model based on task identity. Continual learning (CL) is less difficult for this class of models thanks to a simple key observation: instead of relying on recalling the input-output relations of all previously seen data, task-conditioned hypernetworks only require rehearsing previous weight realizations, which can be maintained in memory using a simple regularizer. Besides achieving good performance on standard CL benchmarks, additional experiments on long task sequences reveal that task-conditioned hypernetworks display an unprecedented capacity to retain previous memories. Notably, such long memory lifetimes are achieved in a compressive regime, when the number of trainable weights is comparable to or smaller than the target network size. We provide insight into the structure of low-dimensional task embedding spaces (the input space of the hypernetwork) and show that task-conditioned hypernetworks demonstrate transfer learning properties. Finally, forward information transfer is further supported by empirical results on a challenging CL benchmark based on the CIFAR-10/100 image datasets.
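
The mechanism described in the abstract, a hypernetwork that maps a task embedding to target-network weights plus a regularizer that rehearses previously generated weight realizations, can be sketched in a few lines. The following is a minimal PyTorch illustration only, not the authors' released code; the class and function names, the simple MLP hypernetwork, and the fixed regularization strength beta are assumptions made for clarity.

```python
import torch
import torch.nn as nn

class TaskConditionedHypernet(nn.Module):
    """Illustrative hypernetwork: maps a learned task embedding to a flat
    vector holding all weights of a (separately defined) target network."""
    def __init__(self, emb_dim: int, target_param_count: int, hidden: int = 100):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(emb_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, target_param_count),
        )

    def forward(self, task_emb: torch.Tensor) -> torch.Tensor:
        return self.net(task_emb)

def output_regularizer(hnet, task_embs, stored_outputs, beta=0.01):
    """Penalize drift of the weights generated for earlier tasks.

    stored_outputs[t] is a detached snapshot of hnet(task_embs[t]) taken
    before training on the current task began, so only one weight vector
    per past task has to be kept in memory (no stored data).
    """
    reg = torch.tensor(0.0)
    for emb, snapshot in zip(task_embs, stored_outputs):
        reg = reg + ((hnet(emb) - snapshot) ** 2).mean()
    return beta * reg
```

In this sketch each task would get its own trainable embedding, and the snapshots in stored_outputs are refreshed once per task boundary, so memory cost grows with the number of tasks rather than with the amount of training data.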

Related articles:
arXiv:1811.01146 [cs.LG] (Published 2018-11-03)
Closed-Loop GAN for Continual Learning
arXiv:1811.11682 [cs.LG] (Published 2018-11-28)
Experience Replay for Continual Learning
arXiv:1904.07734 [cs.LG] (Published 2019-04-15)
Three scenarios for continual learning