arXiv Analytics

arXiv:1712.05440 [cs.LG]

Nonparametric Neural Networks

George Philipp, Jaime G. Carbonell

Published 2017-12-14 (Version 1)

Automatically determining the optimal size of a neural network for a given task without prior information currently requires an expensive global search and training many networks from scratch. In this paper, we address the problem of automatically finding a good network size during a single training cycle. We introduce *nonparametric neural networks*, a non-probabilistic framework for conducting optimization over all possible network sizes and prove its soundness when network growth is limited via an L_p penalty. We train networks under this framework by continuously adding new units while eliminating redundant units via an L_2 penalty. We employ a novel optimization algorithm, which we term *adaptive radial-angular gradient descent* or *AdaRad*, and obtain promising results.
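As a rough illustration of the elimination mechanic the abstract describes, a group-style L_2 penalty on each unit's fan-in weights can drive entire columns toward zero, after which those units can be removed. This is a minimal NumPy sketch of that idea only; the function names, the pruning threshold, and the column-wise grouping are my own assumptions, not the paper's actual formulation or the AdaRad algorithm:

```python
import numpy as np

def group_l2_penalty_grad(W, lam):
    """Gradient of lam * sum_j ||W[:, j]||_2, where column j holds
    the fan-in weights of hidden unit j (illustrative choice)."""
    norms = np.linalg.norm(W, axis=0, keepdims=True)
    return lam * W / np.maximum(norms, 1e-12)  # avoid division by zero

def prune_dead_units(W_in, W_out, tol=1e-3):
    """Remove hidden units whose fan-in weight norm has shrunk
    to (near) zero under the penalty. `tol` is a hypothetical cutoff."""
    alive = np.linalg.norm(W_in, axis=0) > tol
    return W_in[:, alive], W_out[alive, :]

# Toy demonstration: one unit's fan-in weights have collapsed to zero.
rng = np.random.default_rng(0)
W_in = rng.normal(size=(4, 3))   # 4 inputs -> 3 hidden units
W_in[:, 1] = 0.0                 # unit 1 is redundant
W_out = rng.normal(size=(3, 2))  # 3 hidden units -> 2 outputs
W_in, W_out = prune_dead_units(W_in, W_out)
print(W_in.shape, W_out.shape)   # width shrinks from 3 to 2 units
```

Growing the network would correspondingly append fresh near-zero columns to `W_in` (and rows to `W_out`), so that width adapts in both directions during a single training run.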

Related articles:
arXiv:1301.4083 [cs.LG] (Published 2013-01-17, updated 2013-07-13)
Knowledge Matters: Importance of Prior Information for Optimization
arXiv:1703.02629 [cs.LG] (Published 2017-03-07)
Online Learning Without Prior Information