arXiv:1703.08816 [cs.LG]

Uncertainty Quantification in the Classification of High Dimensional Data

Andrea L. Bertozzi, Xiyang Luo, Andrew M. Stuart, Konstantinos C. Zygalakis

Published 2017-03-26, Version 1

Classification of high dimensional data finds wide-ranging applications. In many of these applications, equipping the resulting classification with a measure of uncertainty may be as important as the classification itself. In this paper we introduce, develop algorithms for, and investigate the properties of a variety of Bayesian models for the task of binary classification; via the posterior distribution on the classification labels, these methods automatically give measures of uncertainty. The methods are all based on the graph formulation of semi-supervised learning. We provide a unified framework that brings together a variety of methods introduced in different communities within the mathematical sciences. We study probit classification, generalize the level-set method for Bayesian inverse problems to the classification setting, and generalize the Ginzburg-Landau optimization-based classifier to a Bayesian setting; we also show that the probit and level-set approaches are natural relaxations of the harmonic function approach. We introduce efficient numerical methods, suited to large datasets, for both MCMC-based sampling and gradient-based MAP estimation. Through numerical experiments we study classification accuracy and uncertainty quantification for our models; these experiments showcase a suite of datasets commonly used to evaluate graph-based semi-supervised learning algorithms.

Comments: 33 pages, 14 figures

Categories: cs.LG, stat.ML

Keywords: classification, uncertainty quantification, graph-based semi-supervised learning algorithms, high dimensional data

Related articles:

arXiv:1708.08591 [cs.LG] (Published 2017-08-29)

EC3: Combining Clustering and Classification for Ensemble Learning

Tanmoy Chakraborty

arXiv:1909.06677 [cs.LG] (Published 2019-09-14)

Predictive Multiplicity in Classification

Charles T. Marx, Flavio du Pin Calmon, Berk Ustun

arXiv:1902.00045 [cs.LG] (Published 2019-01-31)

Gaussian Conditional Random Fields for Classification