arXiv:1905.02249 [cs.LG]

MixMatch: A Holistic Approach to Semi-Supervised Learning

David Berthelot, Nicholas Carlini, Ian Goodfellow, Nicolas Papernot, Avital Oliver, Colin Raffel

Published 2019-05-06, Version 1

Semi-supervised learning has proven to be a powerful paradigm for leveraging unlabeled data to mitigate the reliance on large labeled datasets. In this work, we unify the current dominant approaches for semi-supervised learning to produce a new algorithm, MixMatch, that works by guessing low-entropy labels for data-augmented unlabeled examples and mixing labeled and unlabeled data using MixUp. We show that MixMatch obtains state-of-the-art results by a large margin across many datasets and labeled data amounts. For example, on CIFAR-10 with 250 labels, we reduce error rate by a factor of 4 (from 38% to 11%) and by a factor of 2 on STL-10. We also demonstrate how MixMatch can help achieve a dramatically better accuracy-privacy trade-off for differential privacy. Finally, we perform an ablation study to tease apart which components of MixMatch are most important for its success.
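The two steps named in the abstract, guessing low-entropy labels for augmented unlabeled examples and mixing examples with MixUp, can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not say how the guessed labels are made low-entropy; the sketch assumes predictions are averaged over several augmentations and then sharpened with a temperature. The function names (`sharpen`, `guess_label`, `mixup`), the default values `temperature=0.5` and `alpha=0.75`, and the use of plain Python lists in place of model outputs are all illustrative assumptions.

```python
import random


def sharpen(probs, temperature=0.5):
    """Lower the entropy of a class distribution.

    Raises each probability to the power 1/temperature and renormalizes.
    Temperature sharpening is an assumption of this sketch; the abstract
    only says the guessed labels are low-entropy.
    """
    powered = [p ** (1.0 / temperature) for p in probs]
    total = sum(powered)
    return [p / total for p in powered]


def guess_label(predictions, temperature=0.5):
    """Guess a label for one unlabeled example.

    `predictions` holds one class distribution per augmented copy of the
    example. They are averaged class by class, then sharpened.
    """
    k = len(predictions)
    num_classes = len(predictions[0])
    mean = [sum(pred[c] for pred in predictions) / k for c in range(num_classes)]
    return sharpen(mean, temperature)


def mixup(x1, y1, x2, y2, alpha=0.75, rng=random):
    """Convexly combine two (input, label) pairs.

    The mixing weight is drawn from Beta(alpha, alpha). Taking
    max(lam, 1 - lam) keeps the result closer to the first pair; this is
    a choice made in this sketch, not something stated in the abstract.
    """
    lam = rng.betavariate(alpha, alpha)
    lam = max(lam, 1.0 - lam)

    def mix(a, b):
        return [lam * u + (1.0 - lam) * v for u, v in zip(a, b)]

    return mix(x1, x2), mix(y1, y2)
```

For instance, `sharpen([0.6, 0.4])` returns roughly `[0.69, 0.31]`, a distribution with lower entropy than the input. In a full training loop, the distributions passed to `guess_label` would come from a model evaluated on augmented copies of each unlabeled example, and `mixup` would be applied across the combined labeled and unlabeled batch.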

Related articles:
arXiv:2007.01293 [cs.LG] (Published 2020-07-02)
Not All Unlabeled Data are Equal: Learning to Weight Data in Semi-supervised Learning
arXiv:1906.10343 [cs.LG] (Published 2019-06-25)
Semi-Supervised Learning with Self-Supervised Networks
arXiv:1908.09574 [cs.LG] (Published 2019-08-26)
Improvability Through Semi-Supervised Learning: A Survey of Theoretical Results