arXiv Analytics

arXiv:1805.06440 [stat.ML]

Regularization Learning Networks

Ira Shavitt, Eran Segal

Published 2018-05-16, Version 1

Despite their impressive performance, Deep Neural Networks (DNNs) typically underperform Gradient Boosting Trees (GBTs) on many tabular-dataset learning tasks. We propose that applying a different regularization coefficient to each weight might boost the performance of DNNs by allowing them to make more use of the more relevant inputs. However, this leads to an intractable number of hyperparameters. Here, we introduce Regularization Learning Networks (RLNs), which overcome this challenge with an efficient hyperparameter tuning scheme that minimizes a new Counterfactual Loss. Our results show that RLNs significantly improve DNNs on tabular datasets and achieve results comparable to GBTs, with the best performance achieved by an ensemble that combines GBTs and RLNs. RLNs produce extremely sparse networks, eliminating up to 99.8% of the network edges and 82% of the input features, thus providing more interpretable models and revealing the importance that the network assigns to different inputs. RLNs could efficiently learn a single network in datasets that comprise both tabular and unstructured data, such as in the setting of medical imaging accompanied by electronic health records.
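To make the idea of one regularization coefficient per weight concrete, the following is a minimal sketch and not the authors' implementation. It only illustrates a per-weight L1 penalty and why the number of coefficients grows with the number of edges; it does not implement the Counterfactual Loss, whose definition is not given in the abstract. The function names, the log-scale parameterization of the coefficients, and the example layer sizes are all assumptions made for illustration.

```python
import math


def per_weight_l1_penalty(weights, log_coeffs):
    """Sum of exp(log_coeffs[i][j]) * |weights[i][j]| over all edges.

    Assumption: coefficients are stored on a log scale so they stay positive.
    """
    total = 0.0
    for w_row, c_row in zip(weights, log_coeffs):
        for w, c in zip(w_row, c_row):
            total += math.exp(c) * abs(w)
    return total


def count_regularization_coefficients(layer_sizes):
    """One coefficient per edge of a fully connected network (biases ignored)."""
    return sum(n_in * n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))


# Hypothetical 2x2 weight matrix.
weights = [[0.5, -2.0],
           [0.0, 1.0]]

# Conventional setup: a single shared coefficient (0.1) for every weight.
shared = [[math.log(0.1)] * 2 for _ in range(2)]
penalty_shared = per_weight_l1_penalty(weights, shared)

# Per-weight setup: each edge has its own coefficient.
individual = [[math.log(0.1), math.log(0.01)],
              [math.log(1.0), math.log(0.1)]]
penalty_individual = per_weight_l1_penalty(weights, individual)

# A small hypothetical network with 100 inputs, 50 hidden units, 1 output.
n_coeffs = count_regularization_coefficients([100, 50, 1])
```

With a shared coefficient there is a single hyperparameter to tune, whereas the per-weight variant needs one per edge, so even this small hypothetical network would require thousands of coefficients. That growth is what makes conventional search over the coefficients impractical and motivates a dedicated tuning scheme.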

Related articles:
arXiv:1606.05340 [stat.ML] (Published 2016-06-16)
Exponential expressivity in deep neural networks through transient chaos
arXiv:1402.1869 [stat.ML] (Published 2014-02-08, updated 2014-06-07)
On the Number of Linear Regions of Deep Neural Networks
arXiv:1806.08734 [stat.ML] (Published 2018-06-22)
On the Spectral Bias of Deep Neural Networks