arXiv:2310.00313 [cs.CL]
In-Context Learning in Large Language Models: A Neuroscience-inspired Analysis of Representations
Safoora Yousefi, Leo Betthauser, Hosein Hasanbeig, Akanksha Saran, Raphaël Millière, Ida Momennejad
Published 2023-09-30 · Version 1
Large language models (LLMs) exhibit remarkable performance improvements through in-context learning (ICL) by leveraging task-specific examples in the input. However, the mechanisms behind this improvement remain elusive. In this work, we investigate how LLM embeddings and attention representations change following in-context learning, and how these changes mediate improvements in behavior. We employ neuroscience-inspired techniques such as representational similarity analysis (RSA) and propose novel methods for parameterized probing and for measuring the ratio of attention to relevant versus irrelevant information in Llama-2 70B and Vicuna 13B. We design three tasks with a priori relationships among their conditions: reading comprehension, linear regression, and adversarial prompt injection. We form hypotheses about expected similarities in task representations to investigate latent changes in embeddings and attention. Our analyses reveal a meaningful correlation between changes in both embedding and attention representations and improvements in behavioral performance after ICL. This empirical framework enables a nuanced understanding of how latent representations shape LLM behavior with and without ICL, offering valuable tools and insights for future research and practical applications.
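To make the RSA component of this analysis concrete, below is a minimal sketch of how representational similarity analysis can be applied to LLM embeddings: build representational dissimilarity matrices (RDMs) over task conditions and correlate a model RDM with a hypothesis RDM. The function names, layer/embedding shapes, and the random placeholder data are illustrative assumptions, not the paper's exact pipeline.

```python
# Minimal RSA sketch for LLM embeddings (illustrative; names and shapes are assumptions).
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.stats import spearmanr

def rdm(embeddings: np.ndarray) -> np.ndarray:
    """Representational dissimilarity matrix: pairwise cosine distance
    between per-condition embeddings of shape (n_conditions, hidden_dim)."""
    return squareform(pdist(embeddings, metric="cosine"))

def rsa_score(model_rdm: np.ndarray, hypothesis_rdm: np.ndarray) -> float:
    """Spearman correlation between the upper triangles of two RDMs."""
    iu = np.triu_indices_from(model_rdm, k=1)
    rho, _ = spearmanr(model_rdm[iu], hypothesis_rdm[iu])
    return rho

# Hypothetical example: compare embeddings extracted with and without ICL examples
# against a hypothesis RDM encoding a priori relationships among 12 task conditions.
rng = np.random.default_rng(0)
emb_zero_shot = rng.normal(size=(12, 4096))   # placeholder activations, not real model output
emb_with_icl = rng.normal(size=(12, 4096))
condition_labels = rng.integers(0, 3, size=(12, 1)).astype(float)
hypothesis_rdm = squareform(pdist(condition_labels, metric="hamming"))

print("RSA (zero-shot):", rsa_score(rdm(emb_zero_shot), hypothesis_rdm))
print("RSA (with ICL): ", rsa_score(rdm(emb_with_icl), hypothesis_rdm))
```

In practice, the embeddings would come from hidden states of the studied models (e.g., Llama-2 70B) under zero-shot and ICL prompts, and the hypothesis RDM would encode the designed relationships among task conditions rather than random labels.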