arXiv:1207.6365 [cs.DS]

Low Rank Approximation and Regression in Input Sparsity Time

Kenneth L. Clarkson, David P. Woodruff

Published 2012-07-26, updated 2013-04-05 (version 4)

We design a new distribution over $\poly(r \eps^{-1}) \times n$ matrices $S$ so that for any fixed $n \times d$ matrix $A$ of rank $r$, with probability at least 9/10, $\norm{SAx}_2 = (1 \pm \eps)\norm{Ax}_2$ simultaneously for all $x \in \mathbb{R}^d$. Such a matrix $S$ is called a \emph{subspace embedding}. Furthermore, $SA$ can be computed in $\nnz(A) + \poly(d \eps^{-1})$ time, where $\nnz(A)$ is the number of non-zero entries of $A$. This improves over all previous subspace embeddings, which required at least $\Omega(nd \log d)$ time to achieve this property. We call our matrices $S$ \emph{sparse embedding matrices}. Using our sparse embedding matrices, we obtain the fastest known algorithms for $(1+\eps)$-approximation for overconstrained least-squares regression, low-rank approximation, approximating all leverage scores, and $\ell_p$-regression. The leading order term in the time complexity of our algorithms is $O(\nnz(A))$ or $O(\nnz(A)\log n)$. We optimize the low-order $\poly(d/\eps)$ terms in our running times (or for rank-$k$ approximation, the $n \cdot \poly(k/\eps)$ term), and show various tradeoffs. For instance, we also use our methods to design new preconditioners that improve the dependence on $\eps$ in least-squares regression to $\log(1/\eps)$. Finally, we provide preliminary experimental results which suggest that our algorithms are competitive in practice.
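The abstract does not spell out the construction, so the following is only a minimal sketch of one way a sparse embedding can be applied, assuming a CountSketch-style matrix $S$ in which every column has a single random $\pm 1$ entry in a uniformly random row. The function name `sparse_embed`, the sizes `n`, `d`, `t`, and the sketch-and-solve regression step are illustrative assumptions, not the authors' code or their tuned parameters.

```python
import numpy as np

def sparse_embed(A, t, rng):
    """Return S @ A for a random t x n matrix S with one +-1 entry per column.

    S is never materialized: row i of A is added, with a random sign,
    to a randomly chosen row of the output, so each row of A is touched once.
    """
    n = A.shape[0]
    rows = rng.integers(0, t, size=n)          # assumed hash h: [n] -> [t]
    signs = rng.choice([-1.0, 1.0], size=n)    # assumed sign function on [n]
    SA = np.zeros((t, A.shape[1]))
    np.add.at(SA, rows, signs[:, None] * A)    # SA[h(i)] += sign(i) * A[i]
    return SA

rng = np.random.default_rng(0)
n, d, t = 20000, 10, 2000                      # illustrative sizes only
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + rng.standard_normal(n)

# Sketch A and b with the same S by embedding the augmented matrix [A, b].
SAb = sparse_embed(np.column_stack([A, b]), t, rng)
SA, Sb = SAb[:, :d], SAb[:, d]

# Norm preservation for one direction x in R^d.
x = rng.standard_normal(d)
norm_ratio = np.linalg.norm(SA @ x) / np.linalg.norm(A @ x)

# Sketch-and-solve least squares versus the exact solution.
x_exact = np.linalg.lstsq(A, b, rcond=None)[0]
x_sketch = np.linalg.lstsq(SA, Sb, rcond=None)[0]
residual_ratio = np.linalg.norm(A @ x_sketch - b) / np.linalg.norm(A @ x_exact - b)

print(norm_ratio, residual_ratio)
```

Because each row of $A$ contributes to exactly one row of $SA$, a sparse-matrix version of the same loop would do work proportional to $\nnz(A)$; the dense NumPy arrays here are used only to keep the sketch short.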

Comments: Included optimization of subspace embedding dimension from $(d/\eps)^4$ to $\tilde{O}(d/\eps)^2$ in Section 4, by more careful analysis of perfect hashing, and minor improvements to regression and low-rank approximation because of it
Categories: cs.DS
Related articles:
arXiv:cs/0206033 [cs.DS] (Published 2002-06-24)
Algorithms for Media
arXiv:cs/0606116 [cs.DS] (Published 2006-06-28)
New Algorithms for Regular Expression Matching
arXiv:1601.02939 [cs.DS] (Published 2016-01-05)
The minimal hitting set generation problem: algorithms and computation