arXiv:2402.16617 [cs.CL]

Long-Context Language Modeling with Parallel Context Encoding

Howard Yen, Tianyu Gao, Danqi Chen

Published 2024-02-26, updated 2024-06-11 (version 2)

Extending large language models (LLMs) to process longer inputs is crucial for a wide range of applications. However, the substantial computational cost of transformers and the limited generalization of positional encoding restrict the size of their context window. We introduce Context Expansion with Parallel Encoding (CEPE), a framework that can be applied to any existing decoder-only LLM to extend its context window. CEPE employs a small encoder to process long inputs chunk by chunk, enabling the frozen decoder to utilize additional contexts via cross-attention. CEPE is efficient, generalizable, and versatile: trained with 8K-token documents, it extends the context window of LLAMA-2 to 128K tokens, offering 10x the throughput with only 1/6 of the memory. CEPE yields strong performance on language modeling and in-context learning. CEPE also excels in retrieval-augmented applications, while existing long-context models degenerate with retrieved contexts. We further introduce a CEPE variant that can extend the context window of instruction-tuned models using only unlabeled data, and showcase its effectiveness on LLAMA-2-CHAT, leading to a strong instruction-following model that can leverage very long contexts on downstream tasks.
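The core mechanism is simple enough to sketch. Below is a minimal, self-contained PyTorch illustration of the parallel-encoding idea described in the abstract: a small encoder processes the long context in fixed-size chunks that are independent of one another, and a cross-attention block lets decoder hidden states read the resulting representations. The module names, layer sizes, and chunk length here are our own toy choices, not the paper's actual configuration; the authors' real implementation (with a trained encoder and a frozen LLAMA-2 decoder) is in the linked repository.

```python
import torch
import torch.nn as nn

class ParallelContextEncoder(nn.Module):
    """Encode a long context chunk by chunk with a small encoder.

    Chunks are folded into the batch dimension and encoded independently,
    so positional encoding never has to generalize beyond the chunk length
    and memory/compute grow linearly with context length.
    """
    def __init__(self, vocab_size, d_model=256, n_layers=2, chunk_len=512):
        super().__init__()
        self.chunk_len = chunk_len
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, context_ids):
        # context_ids: (batch, total_len); pad to a multiple of chunk_len
        b, t = context_ids.shape
        pad = (-t) % self.chunk_len
        if pad:
            context_ids = nn.functional.pad(context_ids, (0, pad))
        # Fold chunks into the batch dimension: each chunk is encoded in
        # isolation ("parallel encoding").
        chunks = context_ids.view(b, -1, self.chunk_len).reshape(-1, self.chunk_len)
        hidden = self.encoder(self.embed(chunks))     # (b * n_chunks, chunk_len, d)
        return hidden.view(b, -1, hidden.size(-1))    # (b, padded_len, d)

class CrossAttentionBlock(nn.Module):
    """Cross-attention letting decoder states attend to encoder outputs.

    In CEPE, blocks like this are inserted into the frozen decoder so it
    can consume the encoded chunks; here it is shown standalone.
    """
    def __init__(self, d_model=256, nhead=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, nhead, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, dec_hidden, enc_hidden):
        out, _ = self.attn(dec_hidden, enc_hidden, enc_hidden)
        return self.norm(dec_hidden + out)

# Toy usage with random inputs (stand-ins for a real tokenizer/decoder):
enc = ParallelContextEncoder(vocab_size=32000)
xattn = CrossAttentionBlock()
ctx = torch.randint(0, 32000, (1, 2048))   # long (e.g., retrieved) context
dec_hidden = torch.randn(1, 128, 256)      # hidden states from a frozen decoder
fused = xattn(dec_hidden, enc(ctx))
print(fused.shape)                         # torch.Size([1, 128, 256])
```

Note that, per the abstract, only the encoder and the added cross-attention weights are trained while the decoder stays frozen, which is what allows the CEPE variant to extend instruction-tuned models like LLAMA-2-CHAT using unlabeled data alone.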

Comments: ACL 2024. Code, models, and data are available at https://github.com/princeton-nlp/CEPE. arXiv admin note: text overlap with arXiv:1912.01214 by other authors
Categories: cs.CL
Related articles:
arXiv:2410.23771 [cs.CL] (Published 2024-10-31)
What is Wrong with Perplexity for Long-context Language Modeling?
Lizhe Fang et al.
arXiv:2410.01651 [cs.CL] (Published 2024-10-02, updated 2025-01-27)
Efficient Length-Generalizable Attention via Causal Retrieval for Long-Context Language Modeling
arXiv:2311.09136 [cs.CL] (Published 2023-11-15)
RRescue: Ranking LLM Responses to Enhance Reasoning Over Context