MarkTechPost@AI December 24, 2024
Why Do Task Vectors Exist in Pretrained LLMs? This AI Research from MIT and Improbable AI Uncovers How Transformers Form Internal Abstractions and the Mechanisms Behind in-Context Learning (ICL)

Large Language Models (LLMs) show striking similarities to human cognition in how they form abstract concepts and adapt to new situations. The research shows that, through in-context learning (ICL), LLMs form internal abstractions resembling human mental models and can adapt to complex tasks without any parameter updates. The researchers propose a concept encoding-decoding mechanism to explain how LLMs develop these internal abstractions, finding that models map latent concepts into distinct representation spaces and learn to apply concept-specific algorithms. Experiments validate the mechanism across different pretrained models and reveal the key role of early layers in concept encoding. The work deepens our understanding of how LLMs operate internally and explains why their performance varies across tasks.

🧠 Large Language Models (LLMs) can form internal abstractions through in-context learning (ICL) and adapt to new tasks without parameter updates, much as humans make sense of complex experience through fundamental concepts.

🧮 The researchers propose a concept encoding-decoding mechanism to explain how LLMs develop internal abstractions. It comprises two stages: concept encoding, which maps latent concepts into distinct representation spaces, and concept decoding, which learns to apply concept-specific algorithms. The two stages develop interdependently.

📊 Experiments validate the concept encoding-decoding mechanism across different pretrained models, using a Concept Decodability (CD) metric to quantify how well latent concepts can be inferred from representations. Higher CD scores correlate with better task performance, indicating that the clarity of concept encoding directly affects model behavior.

🔬 Early layers play a key role in concept encoding: modifying them yields larger performance gains than modifying later layers. In addition, concepts that appear frequently in pretraining (such as nouns and basic arithmetic operations) are more cleanly separated in representation space.

Large Language Models (LLMs) have demonstrated remarkable similarities to human cognition in their ability to form abstractions and adapt to new situations. Just as humans have historically made sense of complex experiences through fundamental concepts like physics and mathematics, autoregressive transformers now show comparable capabilities through in-context learning (ICL). Recent research has highlighted how these models can adapt to complex tasks without parameter updates, suggesting the formation of internal abstractions similar to human mental models. Studies have begun exploring the mechanistic aspects of how pretrained LLMs represent latent concepts as vectors in their representations. However, questions remain about why these task vectors exist and why their effectiveness varies across tasks.

Researchers have proposed several theoretical frameworks to understand the mechanisms behind in-context learning in LLMs. One significant approach views ICL through a Bayesian framework, suggesting a two-stage algorithm that estimates posterior probability and likelihood. Parallel to this, studies have identified task-specific vectors in LLMs that can trigger desired ICL behaviors. At the same time, other research has revealed how these models encode concepts like truthfulness, time, and space as linearly separable representations. Through mechanistic interpretability techniques such as causal mediation analysis and activation patching, researchers have begun to uncover how these concepts emerge in LLM representations and influence downstream ICL task performance, demonstrating that transformers implement different algorithms based on inferred concepts.
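To make the two-stage Bayesian view concrete, a common schematic formalization (a sketch of the standard framing, not an equation taken from the paper) treats the demonstrations D = {(x_i, y_i)} as evidence about a latent concept θ that is first inferred and then used to answer the query:

```latex
% Bayesian view of ICL (schematic): infer the latent concept from the
% demonstrations, then apply the concept-specific predictive rule.
p(y \mid x, D) = \int p(y \mid x, \theta)\, p(\theta \mid D)\, d\theta
\;\approx\; p\big(y \mid x, \hat{\theta}\big),
\qquad \hat{\theta} = \arg\max_{\theta} p(\theta \mid D)
```

In this picture, concept encoding corresponds to estimating p(θ | D) within the network's representations, and concept decoding corresponds to selecting the algorithm that computes p(y | x, θ̂).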

Researchers from the Massachusetts Institute of Technology and Improbable AI introduce the concept encoding-decoding mechanism, providing a compelling explanation for how transformers develop internal abstractions. Research on a small transformer trained on sparse linear regression tasks reveals that concept encoding emerges as the model learns to map different latent concepts into distinct, separable representation spaces. This process operates in tandem with the development of concept-specific ICL algorithms through concept decoding. Testing across various pretrained model families, including Llama-3.1 and Gemma-2 at different sizes, demonstrates that larger language models exhibit this concept encoding-decoding behavior when processing natural ICL tasks. The research introduces Concept Decodability as a geometric measure of internal abstraction formation, showing that earlier layers encode latent concepts while later layers condition algorithms on these inferred concepts, with both processes developing interdependently.

The theoretical framework for understanding in-context learning draws heavily from a Bayesian perspective, which proposes that transformers implicitly infer latent variables from demonstrations before generating answers. This process operates in two distinct stages: latent concept inference and selective algorithm application. Experimental evidence from synthetic tasks, particularly using sparse linear regression, demonstrates how this mechanism emerges during model training. When trained on multiple tasks with different underlying bases, models develop distinct representational spaces for different concepts while simultaneously learning to apply concept-specific algorithms. The research reveals that concepts sharing overlaps or correlations tend to share representational subspaces, suggesting potential limitations in how models distinguish between related tasks in natural language processing.
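The sparse linear regression setup can be sketched as follows; the dimensions, number of bases, and sequence format below are illustrative assumptions rather than the paper's exact configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

DIM = 16          # input dimensionality (illustrative)
N_BASES = 4       # number of latent "concepts" = distinct sparse bases
K_ACTIVE = 4      # how many coordinates each sparse basis uses
N_EXAMPLES = 32   # in-context (x, y) pairs per sequence

# Each latent concept is a fixed subset of coordinates on which the
# regression weights are allowed to be non-zero.
bases = [rng.choice(DIM, size=K_ACTIVE, replace=False) for _ in range(N_BASES)]

def sample_icl_sequence():
    """Sample one in-context learning sequence for a random latent concept."""
    concept = rng.integers(N_BASES)                 # latent variable to be inferred
    w = np.zeros(DIM)
    w[bases[concept]] = rng.normal(size=K_ACTIVE)   # sparse regression weights
    xs = rng.normal(size=(N_EXAMPLES, DIM))
    ys = xs @ w                                     # noiseless targets for simplicity
    return concept, xs, ys

concept, xs, ys = sample_icl_sequence()
# A transformer trained on many such sequences must (i) encode which basis
# generated the demonstrations and (ii) apply regression restricted to that
# basis -- the encoding/decoding split the paper studies.
```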

The research provides compelling empirical validation of the concept encoding-decoding mechanism in pretrained Large Language Models across different families and scales, including Llama-3.1 and Gemma-2. Through experiments with part-of-speech tagging and bitwise arithmetic tasks, researchers demonstrated that models develop more distinct representational spaces for different concepts as the number of in-context examples increases. The study introduces Concept Decodability (CD) as a metric to quantify how well latent concepts can be inferred from representations, showing that higher CD scores correlate strongly with better task performance. Notably, concepts frequently encountered during pretraining, such as nouns and basic arithmetic operations, show clearer separation in representational space compared to more complex concepts. The research further demonstrates through finetuning experiments that early layers play a crucial role in concept encoding, with modifications to these layers yielding significantly better performance improvements than changes to later layers.
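As a rough illustration of what a Concept Decodability-style measurement involves, one can train a simple probe to predict the latent concept from a layer's hidden states and report held-out accuracy; the probe choice and readout position here are assumptions, and the paper's exact definition of CD may differ:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def concept_decodability(hidden_states: np.ndarray, concept_labels: np.ndarray) -> float:
    """Proxy for CD at one layer: held-out accuracy of a linear probe.

    hidden_states:  (n_sequences, hidden_dim) -- e.g. the representation of the
                    last demonstration token at the chosen layer.
    concept_labels: (n_sequences,)            -- which latent concept generated
                    each in-context sequence.
    """
    probe = LogisticRegression(max_iter=1000)
    scores = cross_val_score(probe, hidden_states, concept_labels, cv=5)
    return float(scores.mean())

# Usage sketch: collect hidden states per layer from the model, then compare
#   cd_by_layer = {layer: concept_decodability(H[layer], labels) for layer in H}
# Higher scores at earlier layers would indicate where concept encoding happens.
```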

The concept encoding-decoding mechanism provides valuable insights into several key questions about Large Language Models’ behavior and capabilities. The research addresses the varying success rates of LLMs across different in-context learning tasks, suggesting that performance bottlenecks can occur at both the concept inference and algorithm decoding stages. Models show stronger performance with concepts frequently encountered during pretraining, such as basic logical operators, but may struggle even with known algorithms if concept distinction remains unclear. The mechanism also explains why explicit modeling of latent variables doesn’t necessarily outperform implicit learning in transformers, as standard transformers naturally develop effective concept encoding capabilities. Also, this framework offers a theoretical foundation for understanding activation-based interventions in LLMs, suggesting that such methods work by directly influencing the encoded representations that guide the model’s generation process.
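A minimal sketch of an activation-based intervention of this kind, using Hugging Face Transformers, adds a precomputed "concept direction" to the residual stream at an early layer during generation. The model name, layer index, prompt, and the placeholder direction are illustrative assumptions, not the paper's procedure:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "meta-llama/Llama-3.1-8B"   # hypothetical model choice
LAYER = 5                                 # an early layer (assumed)

tok = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, torch_dtype=torch.bfloat16)

# Placeholder: in practice this direction would be estimated from data,
# e.g. the difference between mean hidden states under two concepts.
concept_direction = torch.zeros(model.config.hidden_size)

def steer(module, inputs, output):
    # Decoder layers return a tuple whose first element is the hidden states;
    # shift every position by the concept direction.
    hidden = output[0] + concept_direction.to(output[0].dtype)
    return (hidden,) + output[1:]

handle = model.model.layers[LAYER].register_forward_hook(steer)
prompt = tok("5 AND 3 = 1, 6 AND 2 = 2, 7 AND 5 =", return_tensors="pt")
out = model.generate(**prompt, max_new_tokens=4)
handle.remove()
print(tok.decode(out[0], skip_special_tokens=True))
```

Intervening at an early layer rather than a late one mirrors the paper's finding that concept encoding is concentrated in early layers.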


Check out the Paper. All credit for this research goes to the researchers of this project.



