arXiv:1809.03207 [cs.LG]

Beyond the Selected Completely At Random Assumption for Learning from Positive and Unlabeled Data

Jessa Bekker, Jesse Davis

Published 2018-09-10 (Version 1)

Most positive and unlabeled (PU) data are subject to selection bias: the labeled examples may, for instance, be drawn from the positive set because they are easier to obtain or more obviously positive. This paper investigates how learning can proceed in this setting. We propose and theoretically analyze an empirical-risk-based method for incorporating the labeling mechanism. Additionally, we investigate under which assumptions learning is possible when the labeling mechanism is not fully understood, and we propose a practical method to enable it. Our empirical analysis supports the theoretical results and shows that accounting for possible selection bias, even when the labeling mechanism is unknown, improves the trained classifiers.
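The empirical-risk-based idea described in the abstract can be sketched as propensity-weighted risk estimation: each labeled example (known positive) is counted as a positive with weight 1/e(x), where e(x) is the probability that a positive example with features x gets labeled, and as a negative with the complementary weight 1 - 1/e(x); unlabeled examples are counted as negatives. This is a minimal illustrative sketch, assuming log loss and a known propensity score — the function and variable names are not from the paper.

```python
import numpy as np

def pu_propensity_risk(p_pred, labeled, e, eps=1e-12):
    """Propensity-weighted empirical risk for PU data (illustrative sketch).

    p_pred  : predicted probability of the positive class, shape (n,)
    labeled : 1 if the example is labeled (hence positive), else 0
    e       : assumed propensity score e(x) = P(labeled | positive, x)
    """
    p = np.clip(p_pred, eps, 1 - eps)
    loss_pos = -np.log(p)        # log loss if the true class is positive
    loss_neg = -np.log(1 - p)    # log loss if the true class is negative
    w = 1.0 / np.clip(e, eps, 1.0)
    # Labeled examples count as a positive with weight 1/e(x) plus a
    # negative with weight 1 - 1/e(x); unlabeled examples count as negatives.
    per_example = np.where(labeled == 1,
                           w * loss_pos + (1.0 - w) * loss_neg,
                           loss_neg)
    return per_example.mean()
```

As a sanity check, when e(x) = 1 for every labeled example (no selection bias and all positives labeled), the weights collapse to 1 and 0, and the estimator reduces to the standard fully supervised log loss with labeled = positive and unlabeled = negative.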

Related articles:
arXiv:1808.08755 [cs.LG] (Published 2018-08-27)
Learning from Positive and Unlabeled Data under the Selected At Random Assumption
arXiv:1809.05710 [cs.LG] (Published 2018-09-15)
Alternate Estimation of a Classifier and the Class-Prior from Positive and Unlabeled Data
arXiv:1911.08696 [cs.LG] (Published 2019-11-20)
Where is the Bottleneck of Adversarial Learning with Unlabeled Data?