arXiv:2006.10562 [cs.LG]

Uncertainty in Gradient Boosting via Ensembles

Aleksei Ustimenko, Liudmila Prokhorenkova, Andrey Malinin

Published 2020-06-18 (Version 1)

Gradient boosting is a powerful machine learning technique that is particularly successful on tasks with heterogeneous features and noisy data. While gradient boosting classification models return a distribution over class labels, regression models typically yield only point predictions. However, in many practical, high-risk applications it is also important to quantify the uncertainty in predictions to avoid costly mistakes. In this work, we examine a probabilistic ensemble-based framework for deriving uncertainty estimates in the predictions of gradient boosting classification and regression models. Crucially, the proposed approach allows the total uncertainty to be decomposed into data uncertainty, which stems from the complexity and noise of the data distribution, and knowledge uncertainty, which stems from a lack of information about a given region of the feature space. Two approaches for generating ensembles are considered: Stochastic Gradient Boosting (SGB) and Stochastic Gradient Langevin Boosting (SGLB). Notably, SGLB also enables the generation of a virtual ensemble from a single gradient boosting model, which significantly reduces the computational cost. Experiments on a range of regression and classification datasets show that ensembles of gradient boosting models yield improved predictive performance, and that the resulting uncertainty measures successfully enable detection of out-of-domain inputs.
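
The decomposition described above can be made concrete with an ensemble of stochastically trained models. Below is a minimal sketch (not the authors' implementation, which is built on CatBoost) that uses scikit-learn's GradientBoostingClassifier as a stand-in SGB ensemble, with row subsampling and different random seeds as the source of stochasticity. Total uncertainty is taken as the entropy of the ensemble-averaged prediction, data uncertainty as the average per-member entropy, and knowledge uncertainty as their difference (the mutual information between the prediction and the model).

    # Sketch of the entropy-based uncertainty decomposition for a
    # classification ensemble; the dataset and hyperparameters are
    # illustrative, not from the paper.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import GradientBoostingClassifier

    X, y = make_classification(n_samples=500, n_features=10, random_state=0)

    # SGB-style ensemble: subsample < 1 plus a different seed per member.
    ensemble = [
        GradientBoostingClassifier(subsample=0.5, random_state=seed).fit(X, y)
        for seed in range(10)
    ]

    # Shape: (members, samples, classes)
    probs = np.stack([m.predict_proba(X) for m in ensemble])
    eps = 1e-12

    mean_p = probs.mean(axis=0)
    total = -(mean_p * np.log(mean_p + eps)).sum(-1)       # entropy of the mean
    data = -(probs * np.log(probs + eps)).sum(-1).mean(0)  # mean per-member entropy
    knowledge = total - data                               # mutual information

    print(knowledge[:5])

In this sketch, high knowledge uncertainty flags inputs on which the ensemble members disagree, which is the signal used for out-of-domain detection.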

Related articles:
arXiv:2010.03753 [cs.LG] (Published 2020-10-08)
Uncertainty in Neural Processes
arXiv:1810.06530 [cs.LG] (Published 2018-10-15)
Successor Uncertainties: exploration and uncertainty in temporal difference learning
arXiv:2405.14066 [cs.LG] (Published 2024-05-22)
Online Classification with Predictions