
arXiv:2106.09685 [cs.CL]

LoRA: Low-Rank Adaptation of Large Language Models

Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Weizhu Chen

Published 2021-06-17, Version 1

The dominant paradigm of natural language processing consists of large-scale pre-training on general domain data and adaptation to particular tasks or domains. As we pre-train larger models, conventional fine-tuning, which retrains all model parameters, becomes less feasible. Using GPT-3 175B as an example, deploying many independent instances of fine-tuned models, each with 175B parameters, is extremely expensive. We propose Low-Rank Adaptation, or LoRA, which freezes the pre-trained model weights and injects trainable rank decomposition matrices into each layer of the Transformer architecture, greatly reducing the number of trainable parameters for downstream tasks. For GPT-3, LoRA can reduce the number of trainable parameters by 10,000 times and the computation hardware requirement by 3 times compared to full fine-tuning. LoRA performs on par with or better than fine-tuning in model quality on both GPT-3 and GPT-2, while having fewer trainable parameters, a higher training throughput, and no additional inference latency. We also provide an empirical investigation into rank-deficiency in language model adaptations, which sheds light on the efficacy of LoRA. We release our implementation in GPT-2 at https://github.com/microsoft/LoRA.
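The abstract's central idea (a frozen weight matrix plus a trainable low-rank update) can be illustrated with a minimal sketch. It assumes the update takes the form W0 + BA, with B of shape d x r and A of shape r x k, and that B starts at zero so the adapted layer initially matches the pre-trained one. The toy sizes, the helper names (`matmul`, `matvec`, `lora_param_counts`), and the random values standing in for a trained B are all hypothetical; this is not the released implementation.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]

def lora_param_counts(d, k, r):
    """Trainable parameters for one d x k weight: full fine-tuning vs. rank-r update."""
    return d * k, r * (d + k)

random.seed(0)
d, k, r = 8, 6, 2  # hypothetical toy sizes, far smaller than any real model

# Frozen pre-trained weight (never updated during adaptation).
W0 = [[random.gauss(0, 1) for _ in range(k)] for _ in range(d)]

# Trainable low-rank factors. B starts at zero, so B @ A is zero at the start.
A = [[random.gauss(0, 0.01) for _ in range(k)] for _ in range(r)]
B = [[0.0] * r for _ in range(d)]

x = [random.gauss(0, 1) for _ in range(k)]

# With B = 0 the adapted layer reproduces the frozen layer exactly.
h_initial = [a + b for a, b in zip(matvec(W0, x), matvec(B, matvec(A, x)))]

# Stand-in for training: pretend gradient updates moved B away from zero.
B = [[random.gauss(0, 0.1) for _ in range(r)] for _ in range(d)]

# Adapter form: frozen path plus low-rank path, h = W0 x + B (A x).
h_adapter = [a + b for a, b in zip(matvec(W0, x), matvec(B, matvec(A, x)))]

# Merged form: fold B @ A into the weight once, then use a single matrix.
delta = matmul(B, A)
W_merged = [[w + dw for w, dw in zip(rw, rd)] for rw, rd in zip(W0, delta)]
h_merged = matvec(W_merged, x)

full, lora = lora_param_counts(d, k, r)
print(f"trainable parameters: full={full}, low-rank={lora}")
```

Because the merged matrix gives the same output as the two-path form, a deployed model can use a single weight per layer, which is one way to read the claim of no additional inference latency. The parameter saving grows with layer size: under the same assumptions, a 4096 x 4096 weight with r = 8 would train 65,536 values instead of 16,777,216.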

Categories: cs.CL, cs.AI, cs.LG

Keywords: large language models, low-rank adaptation, trainable parameters, rank decomposition matrices, natural language processing

Related articles:

arXiv:2205.11916 [cs.CL] (Published 2022-05-24)

Large Language Models are Zero-Shot Reasoners

Takeshi Kojima, Shixiang Shane Gu, Machel Reid, Yutaka Matsuo, Yusuke Iwasawa

arXiv:2207.14382 [cs.CL] (Published 2022-07-28)

Large Language Models and the Reverse Turing Test

Terrence Sejnowski

arXiv:2211.05110 [cs.CL] (Published 2022-11-09)

Large Language Models with Controllable Working Memory

Daliang Li et al.
