
arXiv:1702.05659 [cs.LG]

On Loss Functions for Deep Neural Networks in Classification

Katarzyna Janocha, Wojciech Marian Czarnecki

Published 2017-02-18, Version 1

Deep neural networks are currently among the most commonly used classifiers. Besides easily achieving very good performance, one of the best selling points of these models is their modular design: one can conveniently adapt their architecture to specific needs, change connectivity patterns, attach specialised layers, and experiment with a large number of activation functions, normalisation schemes and many other components. While one can find an impressively wide spread of configurations for almost every aspect of deep nets, one element is, in the authors' opinion, underrepresented: when solving classification problems, the vast majority of papers and applications simply use log loss. In this paper we investigate how particular choices of loss function affect deep models and their learning dynamics, as well as the resulting classifiers' robustness to various effects. We perform experiments on classical datasets and provide some additional theoretical insights into the problem. In particular, we show that L1 and L2 losses are, quite surprisingly, justified classification objectives for deep nets, by providing a probabilistic interpretation in terms of expected misclassification. We also introduce two losses which are not typically used as deep net objectives and show that they are viable alternatives to the existing ones.
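To make the three losses named in the abstract concrete, the following is a minimal sketch, not the authors' code. It assumes the losses are applied to softmax outputs and compared against a one-hot target; the helper names (`softmax`, `one_hot`, `log_loss`, `l1_loss`, `l2_loss`) and the example logits are illustrative choices, not taken from the paper.

```python
import math


def softmax(logits):
    """Turn raw scores into a probability vector (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def one_hot(label, num_classes):
    """One-hot encoding of an integer class label."""
    return [1.0 if k == label else 0.0 for k in range(num_classes)]


def log_loss(probs, label):
    """Log loss (cross-entropy): negative log-probability of the true class."""
    return -math.log(probs[label])


def l1_loss(probs, label):
    """L1 distance between the predicted distribution and the one-hot target."""
    target = one_hot(label, len(probs))
    return sum(abs(p - t) for p, t in zip(probs, target))


def l2_loss(probs, label):
    """Squared L2 distance between the predicted distribution and the target."""
    target = one_hot(label, len(probs))
    return sum((p - t) ** 2 for p, t in zip(probs, target))


# Hypothetical logits for a 3-class problem; the true class is 0.
probs = softmax([2.0, 0.5, -1.0])
for name, fn in [("log", log_loss), ("L1", l1_loss), ("L2", l2_loss)]:
    print(name, fn(probs, 0))
```

One way to see a link between L1 and misclassification in this setting: for a probability vector p and true class y, the L1 distance to the one-hot target equals 2(1 - p_y), which is twice the probability of picking a wrong class when a label is sampled from p. This is offered as an illustration of the idea and may differ in detail from the derivation in the paper.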

Comments: Presented at Theoretical Foundations of Machine Learning 2017 (TFML 2017)

Categories: cs.LG

Keywords: deep neural networks, classification, loss functions affect deep models, change connectivity patterns, deep nets objectives

Related articles:

arXiv:1909.06677 [cs.LG] (Published 2019-09-14)

Predictive Multiplicity in Classification

Charles T. Marx, Flavio du Pin Calmon, Berk Ustun

arXiv:1703.08816 [cs.LG] (Published 2017-03-26)

Uncertainty Quantification in the Classification of High Dimensional Data

Andrea L. Bertozzi, Xiyang Luo, Andrew M. Stuart, Konstantinos C. Zygalakis

arXiv:1705.09055 [cs.LG] (Published 2017-05-25)

The cost of fairness in classification