
arXiv:1611.10258 [cs.LG]

Reliably Learning the ReLU in Polynomial Time

Surbhi Goel, Varun Kanade, Adam Klivans, Justin Thaler

Published 2016-11-30, Version 1

We give the first dimension-efficient algorithms for learning Rectified Linear Units (ReLUs), which are functions of the form $\mathbf{x} \mapsto \max(0, \mathbf{w} \cdot \mathbf{x})$ with $\mathbf{w} \in \mathbb{S}^{n-1}$. Our algorithm works in the challenging Reliable Agnostic learning model of Kalai, Kanade, and Mansour (2009), where the learner is given access to a distribution $\mathcal{D}$ on labeled examples but the labeling may be arbitrary. We construct a hypothesis that simultaneously minimizes the false-positive rate and the loss on inputs given positive labels by $\mathcal{D}$, for any convex, bounded, and Lipschitz loss function. The algorithm runs in polynomial time (in $n$) with respect to any distribution on $\mathbb{S}^{n-1}$ (the unit sphere in $n$ dimensions) and for any error parameter $\epsilon = \Omega(1/\log n)$. This yields a PTAS for a question raised by F. Bach on the complexity of maximizing ReLUs. These results are in contrast to known efficient algorithms for reliably learning linear threshold functions, where $\epsilon$ must be $\Omega(1)$ and strong assumptions are required on the marginal distribution. We can compose our results to obtain the first set of efficient algorithms for learning constant-depth networks of ReLUs. Our techniques combine kernel methods and polynomial approximations with a "dual-loss" approach to convex programming. As a byproduct we obtain a number of applications, including the first set of efficient algorithms for "convex piecewise-linear fitting" and the first efficient algorithms for noisy polynomial reconstruction of low-weight polynomials on the unit sphere.
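To make the objects in the abstract concrete, here is a minimal Python sketch of a ReLU with a unit-norm weight vector and of the two quantities a reliable learner is asked to keep small. It is not the paper's algorithm. The helper names (`normalize`, `relu`, `false_positive_rate`, `positive_loss`) are hypothetical, and the exact form of the two quantities is an assumption based on one reading of the abstract: the false-positive rate is taken as the fraction of all examples that are labeled 0 but receive a nonzero prediction, and the positive loss as the loss summed over positively labeled examples, divided by the total number of examples. Squared loss stands in for "any convex, bounded, and Lipschitz loss", which it is on a bounded range.

```python
import math
import random


def normalize(v):
    """Scale v to unit Euclidean norm, so it lies on the sphere S^{n-1}."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def relu(w, x):
    """The ReLU x -> max(0, w . x) from the abstract."""
    return max(0.0, sum(wi * xi for wi, xi in zip(w, x)))


def false_positive_rate(h, samples):
    """Assumed form: fraction of all samples with y == 0 and h(x) > 0."""
    return sum(1 for x, y in samples if y == 0 and h(x) > 0) / len(samples)


def positive_loss(h, samples, loss):
    """Assumed form: loss summed over samples with y > 0, over all samples."""
    return sum(loss(h(x), y) for x, y in samples if y > 0) / len(samples)


def squared_loss(prediction, label):
    """Illustrative convex loss; bounded and Lipschitz on a bounded range."""
    return (prediction - label) ** 2


# Toy data (an assumption for illustration): points on the unit sphere,
# labeled by a hidden ReLU, so the labels here are noiseless.
rng = random.Random(0)
n = 5
w_true = normalize([rng.gauss(0, 1) for _ in range(n)])
samples = []
for _ in range(200):
    x = normalize([rng.gauss(0, 1) for _ in range(n)])
    samples.append((x, relu(w_true, x)))


def h(x):
    """Hypothesis that happens to equal the labeling ReLU."""
    return relu(w_true, x)


print(false_positive_rate(h, samples), positive_loss(h, samples, squared_loss))
```

In the agnostic setting the abstract describes, the labels may be arbitrary, so no hypothesis need drive both quantities to zero; the toy data above is noiseless only to keep the example easy to check.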

Categories: cs.LG, stat.ML

Keywords: polynomial time, reliable agnostic learning model, unit sphere, first set, reliably learning linear threshold functions

Related articles:

arXiv:1207.1366 [cs.LG] (Published 2012-07-04)

Learning Factor Graphs in Polynomial Time & Sample Complexity

Pieter Abbeel, Daphne Koller, Andrew Y. Ng

arXiv:1906.07437 [cs.LG] (Published 2019-06-18)

Inverting Deep Generative models, One layer at a time

Qi Lei, Ajil Jalal, Inderjit S. Dhillon, Alexandros G. Dimakis

arXiv:1803.04087 [cs.LG] (Published 2018-03-12)

Learning Binary Bayesian Networks in Polynomial Time and Sample Complexity

Adarsh Barik, Jean Honorio