
arXiv:cs/0105016 [cs.CL]

Probabilistic top-down parsing and language modeling

Published 2001-05-08, Version 1

This paper describes the functioning of a broad-coverage probabilistic top-down parser, and its application to the problem of language modeling for speech recognition. The paper first introduces key notions in language modeling and probabilistic parsing, and briefly reviews some previous approaches to using syntactic structure for language modeling. A lexicalized probabilistic top-down parser is then presented, which performs very well, in terms of both the accuracy of returned parses and the efficiency with which they are found, relative to the best broad-coverage statistical parsers. A new language model which utilizes probabilistic top-down parsing is then outlined, and empirical results show that it improves upon previous work in test corpus perplexity. Interpolation with a trigram model yields an exceptional improvement relative to the improvement observed by other models, demonstrating the degree to which the information captured by our parsing model is orthogonal to that captured by a trigram model. A small recognition experiment also demonstrates the utility of the model.
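As a rough illustration of the two quantities the abstract refers to, test corpus perplexity and interpolation with a trigram model, the following sketch computes perplexity from per-word conditional probabilities and linearly mixes two models word by word. The function names, the mixing weight `lam`, and the probability values are all assumptions made up for this example; the abstract does not give the paper's actual interpolation weight or any probabilities.

```python
import math


def perplexity(word_probs):
    """Perplexity of a string given the conditional probability of each word."""
    log_sum = sum(math.log(p) for p in word_probs)
    return math.exp(-log_sum / len(word_probs))


def interpolate(parser_probs, trigram_probs, lam):
    """Word-by-word linear mix: lam * P_parser + (1 - lam) * P_trigram."""
    return [lam * p + (1.0 - lam) * q
            for p, q in zip(parser_probs, trigram_probs)]


# Invented per-word probabilities for a four-word string (illustrative only).
# Each model is confident on words where the other is not.
parser_probs = [0.10, 0.02, 0.30, 0.05]
trigram_probs = [0.02, 0.10, 0.05, 0.30]

# lam = 0.5 is an arbitrary choice for the sketch, not a value from the paper.
mixed_probs = interpolate(parser_probs, trigram_probs, lam=0.5)

print("parser  perplexity:", perplexity(parser_probs))
print("trigram perplexity:", perplexity(trigram_probs))
print("mixed   perplexity:", perplexity(mixed_probs))
```

In this toy setup the mixed model has lower perplexity than either component because the two models make their errors on different words, which is one way to read the abstract's claim that the parsing model captures information orthogonal to that of a trigram model.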

Comments: 28 pages, 6 tables, 8 figures. To appear in Computational Linguistics 27(2), June 2001

Categories: cs.CL

Subjects: I.2.7

Keywords: language model, probabilistic top-down parsing, broad-coverage probabilistic top-down parser, best broad-coverage statistical parsers, small recognition experiment

Related articles:

arXiv:2201.08687 [cs.CL] (Published 2022-01-21)

A Comparative Study on Language Models for Task-Oriented Dialogue Systems

Vinsen Marselino Andreas, Genta Indra Winata, Ayu Purwarianti

arXiv:2011.06057 [cs.CL] (Published 2020-11-11)

Exploring the Value of Personalized Word Embeddings

Charles Welch, Jonathan K. Kummerfeld, Verónica Pérez-Rosas, Rada Mihalcea

arXiv:2203.11370 [cs.CL] (Published 2022-03-21)