
arXiv:1702.05659 [cs.LG]

On Loss Functions for Deep Neural Networks in Classification

Katarzyna Janocha, Wojciech Marian Czarnecki

Published 2017-02-18 (Version 1)

Deep neural networks are currently among the most commonly used classifiers. Beyond easily achieving very good performance, one of the strongest selling points of these models is their modular design: one can conveniently adapt the architecture to specific needs, change connectivity patterns, attach specialised layers, and experiment with a wide range of activation functions, normalisation schemes and other components. Yet while one finds an impressively wide spread of configurations for almost every aspect of deep nets, one element is, in the authors' opinion, underrepresented: when solving classification problems, the vast majority of papers and applications simply use the log loss. In this paper we investigate how particular choices of loss function affect deep models and their learning dynamics, as well as the robustness of the resulting classifiers to various effects. We perform experiments on classical datasets and also provide additional theoretical insights into the problem. In particular, we show that, quite surprisingly, the L1 and L2 losses are justified classification objectives for deep nets, by giving them a probabilistic interpretation in terms of expected misclassification. We also introduce two losses that are not typically used as deep-net objectives and show that they are viable alternatives to the existing ones.
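As a concrete illustration (a minimal sketch, not the paper's exact formulation), the snippet below compares the log loss with the L1 and L2 losses, all computed on softmax outputs against one-hot targets; the function names (softmax, log_loss, l1_loss, l2_loss) are ours. Note that for a probability vector p and a one-hot target with true class t, the L1 loss equals 2(1 − p_t), i.e. twice the probability of misclassifying if the predicted class were sampled from p — one way to see the expected-misclassification interpretation the abstract alludes to.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def log_loss(p, y):
    """Cross-entropy: -sum_k y_k log p_k, averaged over samples."""
    return -np.mean(np.sum(y * np.log(p + 1e-12), axis=-1))

def l1_loss(p, y):
    """L1 distance between predicted distribution and one-hot target.
    For one-hot y this equals 2 * (1 - p_true) per sample."""
    return np.mean(np.sum(np.abs(p - y), axis=-1))

def l2_loss(p, y):
    """Squared L2 distance between predicted distribution and one-hot target."""
    return np.mean(np.sum((p - y) ** 2, axis=-1))

# Example: logits for 2 samples over 3 classes; true classes are 0 and 2.
logits = np.array([[2.0, 0.5, -1.0],
                   [0.1, 0.2,  1.5]])
targets = np.eye(3)[[0, 2]]
p = softmax(logits)
print(log_loss(p, targets), l1_loss(p, targets), l2_loss(p, targets))
```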

Comments: Presented at Theoretical Foundations of Machine Learning 2017 (TFML 2017)
Categories: cs.LG