arXiv:1908.09574 [cs.LG]

Improvability Through Semi-Supervised Learning: A Survey of Theoretical Results

Alexander Mey, Marco Loog

Published 2019-08-26 (version 1)

Semi-supervised learning is a setting in which both labeled and unlabeled data are available. In this survey we explore different types of theoretical results on the use of unlabeled data in classification and regression tasks. Most methods that use unlabeled data rely on certain assumptions about the data distribution; when those assumptions do not hold, including unlabeled data may actually decrease performance. When studying such methods, it is therefore particularly important to understand the underlying theory. In this review we gather results about the possible gains one can achieve when using semi-supervised learning, as well as results about the limits of such methods. More precisely, this review collects answers to the following questions: What are, in terms of improving supervised methods, the limits of semi-supervised learning? What are the assumptions of different methods? What can we achieve if those assumptions hold? Finally, we also discuss the biggest bottleneck of semi-supervised learning, namely the assumptions these methods make.
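The abstract's central point, that unlabeled data helps only under distributional assumptions, can be made concrete with a small self-training (pseudo-labeling) sketch, one standard semi-supervised scheme. This is an illustrative example, not the survey's own method; the confidence threshold, the choice of classifier, and the function name are assumptions made here for illustration.

```python
# Minimal self-training sketch (illustrative; not the survey's own method).
# It relies on the assumption that high-confidence predictions on unlabeled
# points are mostly correct (a form of the low-density-separation / cluster
# assumption). If that assumption fails, the pseudo-labels added below can
# reinforce errors and decrease performance relative to supervised learning.
import numpy as np
from sklearn.linear_model import LogisticRegression


def self_train(X_lab, y_lab, X_unlab, threshold=0.95, max_rounds=10):
    """Iteratively pseudo-label confident unlabeled points and refit."""
    X, y = X_lab.copy(), y_lab.copy()
    unlab = X_unlab.copy()
    model = LogisticRegression(max_iter=1000)
    for _ in range(max_rounds):
        model.fit(X, y)
        if len(unlab) == 0:
            break
        proba = model.predict_proba(unlab)
        conf = proba.max(axis=1)
        mask = conf >= threshold              # keep only confident predictions
        if not mask.any():
            break                             # nothing confident enough: stop
        pseudo_y = model.classes_[proba[mask].argmax(axis=1)]
        X = np.vstack([X, unlab[mask]])       # add pseudo-labeled points
        y = np.concatenate([y, pseudo_y])
        unlab = unlab[~mask]
    return model
```

The threshold controls how strongly the method trusts its own assumption: a high value adds few but reliable pseudo-labels, while a low value adds many points and magnifies the cost of a violated assumption.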
