arXiv:1212.4777 [cs.LG]

A Practical Algorithm for Topic Modeling with Provable Guarantees

Sanjeev Arora, Rong Ge, Yoni Halpern, David Mimno, Ankur Moitra, David Sontag, Yichen Wu, Michael Zhu

Published 2012-12-19, Version 1

Topic models provide a useful method for dimensionality reduction and exploratory data analysis in large text corpora. Most approaches to topic model inference have been based on a maximum likelihood objective. Efficient algorithms exist that approximate this objective, but they have no provable guarantees. Recently, algorithms have been introduced that provide provable bounds, but these algorithms are not practical because they are inefficient and not robust to violations of model assumptions. In this paper we present an algorithm for topic model inference that is both provable and practical. The algorithm produces results comparable to the best MCMC implementations while running orders of magnitude faster.

Comments: 26 pages

Categories: cs.LG, cs.DS, stat.ML

Keywords: provable guarantees, practical algorithm, topic model inference, large text corpora, best MCMC implementations

Related articles:

arXiv:2105.03692 [cs.LG] (Published 2021-05-08)

Provable Guarantees against Data Poisoning Using Self-Expansion and Compatibility

Charles Jin, Melinda Sun, Martin Rinard

arXiv:1812.03825 [cs.LG] (Published 2018-12-07)

Asynchronous Training of Word Embeddings for Large Text Corpora

Avishek Anand, Megha Khosla, Jaspreet Singh, Jan-Hendrik Zab, Zijian Zhang

arXiv:1902.10644 [cs.LG] (Published 2019-02-27)

Provable Guarantees for Gradient-Based Meta-Learning