LLMs Illustrated: A Must-Read Introduction to Large Models

Juejin AI · April 27, 18:22

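This article uses nine diagrams to give a high-level overview of the key techniques behind large language models (LLMs). It covers model architecture, fine-tuning methods, RAG, Agentic AI design patterns, text chunking strategies, and the capability levels of Agentic AI systems. From Transformer to MoE, from LoRA to Agentic RAG, and on to KV caching, it aims to help readers quickly grasp recent advances and core concepts in the LLM field. It also introduces the RAG variants HyDE and Graph RAG.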

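🧠 **Transformer vs. MoE**: A standard Transformer block passes every token through the same fixed feed-forward network. A Mixture of Experts (MoE) layer replaces that network with several expert networks and a router that dynamically selects only a few of them for each token, so the model's total capacity can grow while per-token compute stays well below that of a dense model of the same size.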
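The routing idea can be sketched in a few lines of plain Python. This is a minimal illustration, not the implementation of any particular model: the names `make_expert` and `moe_layer`, the layer sizes, the ReLU experts, and the choice of softmax over the selected logits are all assumptions made for the example.

```python
import math
import random

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(v - m) for v in xs]
    total = sum(exps)
    return [e / total for e in exps]

def make_expert(d_model, d_hidden, rng):
    """Build one small feed-forward expert (hypothetical sizes and init)."""
    W1 = [[rng.gauss(0, 0.1) for _ in range(d_model)] for _ in range(d_hidden)]
    W2 = [[rng.gauss(0, 0.1) for _ in range(d_hidden)] for _ in range(d_model)]
    def expert(x):
        hidden = [max(0.0, v) for v in matvec(W1, x)]  # ReLU
        return matvec(W2, hidden)
    return expert

def moe_layer(x, router_W, experts, top_k=2):
    """Route one token vector x to its top_k experts and mix their outputs."""
    logits = matvec(router_W, x)                     # one score per expert
    order = sorted(range(len(experts)), key=lambda i: logits[i], reverse=True)
    chosen = order[:top_k]                           # only these experts run
    gates = softmax([logits[i] for i in chosen])     # mixing weights
    out = [0.0] * len(x)
    for gate, idx in zip(gates, chosen):
        y = experts[idx](x)
        out = [o + gate * yi for o, yi in zip(out, y)]
    return out, chosen

rng = random.Random(0)
d_model, d_hidden, n_experts = 4, 8, 4
experts = [make_expert(d_model, d_hidden, rng) for _ in range(n_experts)]
router_W = [[rng.gauss(0, 1) for _ in range(d_model)] for _ in range(n_experts)]
token = [0.5, -1.0, 0.25, 2.0]
output, chosen = moe_layer(token, router_W, experts, top_k=2)
```

With `top_k=2` and four experts, only half of the expert parameters are used for this token, which is where the compute saving comes from.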

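🛠️ **Five ways to fine-tune an LLM (the LoRA family)**: LoRA freezes the original weights and trains a pair of low-rank matrices, A and B, whose product is added to them. LoRA-FA additionally freezes A and trains only B. VeRA uses far fewer trainable parameters: it shares frozen random matrices across layers and trains only small scaling vectors. Delta-LoRA also updates the original weights, using the change in the product of A and B between consecutive training steps. LoRA+ applies a larger learning rate to B than to A.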
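As a rough sketch of the basic LoRA idea (frozen weights plus a trainable low-rank update), the class below wraps a weight matrix in plain Python. The class name `LoRALinear`, the matrix sizes, and the initialisation values are assumptions for illustration; the `alpha / rank` scaling and the zero-initialised B follow the usual LoRA formulation.

```python
import random

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

class LoRALinear:
    """Linear layer y = W x + (alpha / rank) * B (A x), with W frozen."""

    def __init__(self, W, rank, alpha, rng):
        d_out, d_in = len(W), len(W[0])
        self.W = W                                   # frozen pretrained weights
        # A starts small and random, B starts at zero, so the layer
        # initially behaves exactly like the frozen base layer.
        self.A = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]
        self.scale = alpha / rank

    def forward(self, x):
        base = matvec(self.W, x)
        delta = matvec(self.B, matvec(self.A, x))    # low-rank update
        return [b + self.scale * d for b, d in zip(base, delta)]

    def trainable_params(self):
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])

rng = random.Random(0)
d = 16
W = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(d)]
layer = LoRALinear(W, rank=2, alpha=4, rng=rng)
x = [rng.gauss(0, 1) for _ in range(d)]

full_params = d * d                      # 256 weights in the frozen matrix
lora_params = layer.trainable_params()   # 64 weights in A and B combined
```

In this toy setting the adapter trains 64 values instead of 256. The variants listed above change which of A and B are trained, shared, or given different learning rates, not this basic structure.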

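🔄 **Traditional RAG vs. Agentic RAG**: Traditional RAG queries the vector store directly with the user's question and concatenates the retrieved results into the context. Agentic RAG adds an agent that iteratively rewrites the question and judges whether the retrieved information is sufficient, which makes the pipeline more adaptive.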
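The difference between the two loops can be shown with a toy example. Everything here is a stand-in: retrieval is plain word overlap instead of a vector store, and `is_sufficient` and `rewrite_query` are hypothetical rule-based placeholders for what would be LLM calls in a real system.

```python
import re

DOCS = [
    "LoRA freezes the pretrained weights and trains low-rank matrices.",
    "KV caching stores key and value vectors from earlier tokens.",
    "A Mixture of Experts layer routes each token to a few experts.",
]

# Hypothetical expansions an agent might add when rewriting a query.
EXPANSIONS = {"cache": "caching key value vectors"}

def tokens(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, docs):
    """Return (score, doc) for the best word-overlap match."""
    q = tokens(query)
    scored = [(len(q & tokens(doc)), doc) for doc in docs]
    return max(scored, key=lambda pair: pair[0])

def is_sufficient(score, threshold=2):
    """Placeholder for the agent's 'is this context enough?' judgment."""
    return score >= threshold

def rewrite_query(query):
    """Placeholder for the agent's query rewriting step."""
    extra = [exp for term, exp in EXPANSIONS.items() if term in tokens(query)]
    return query + " " + " ".join(extra) if extra else query

def traditional_rag(question, docs):
    """Single retrieval pass, no checking."""
    _, doc = retrieve(question, docs)
    return doc

def agentic_rag(question, docs, max_steps=3):
    """Retrieve, check sufficiency, rewrite and retry if needed."""
    query, trace = question, []
    for _ in range(max_steps):
        score, doc = retrieve(query, docs)
        trace.append((query, score))
        if is_sufficient(score):
            break
        new_query = rewrite_query(query)
        if new_query == query:       # nothing left to try
            break
        query = new_query
    return doc, trace

question = "How does a kv cache speed up decoding?"
context, trace = agentic_rag(question, DOCS)
```

Here the first retrieval matches only one word, so the loop rewrites the query and retrieves again before accepting the context; `trace` records each attempt.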

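🔗 **Traditional RAG vs. Graph RAG**: Traditional RAG relies on a vector store to retrieve documents. Graph RAG uses an LLM to build a knowledge graph, stores it in a graph database, and traverses the graph at query time to obtain structured context.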
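A minimal sketch of the graph traversal step follows, using an in-memory adjacency list in place of a graph database. The triples are hand-written for illustration (in Graph RAG they would be extracted by an LLM), and `build_graph` and `graph_context` are hypothetical helper names.

```python
from collections import defaultdict, deque

# Illustrative (subject, relation, object) triples.
TRIPLES = [
    ("LoRA", "is_a", "fine-tuning method"),
    ("LoRA", "trains", "low-rank matrices"),
    ("LoRA+", "extends", "LoRA"),
    ("VeRA", "extends", "LoRA"),
    ("KV caching", "speeds_up", "inference"),
]

def build_graph(triples):
    """Index each triple under both of its entities."""
    graph = defaultdict(list)
    for triple in triples:
        subject, _, obj = triple
        graph[subject].append((obj, triple))
        graph[obj].append((subject, triple))
    return graph

def graph_context(graph, start, max_hops=1):
    """Breadth-first traversal collecting triples within max_hops of start."""
    seen = {start}
    facts = []
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue
        for neighbor, triple in graph.get(node, []):
            if triple not in facts:
                facts.append(triple)
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return facts

graph = build_graph(TRIPLES)
facts = graph_context(graph, "LoRA", max_hops=1)
# Structured context that could be placed in the prompt.
context = "\n".join(f"{s} {r} {o}" for s, r, o in facts)
```

Starting from the entity `LoRA`, the traversal returns only the triples connected to it and leaves out the unrelated `KV caching` fact, which is the kind of structured, relation-aware context a plain similarity search does not give directly.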

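💡 **KV Caching in LLMs**: Generating the next token needs only the hidden state at the last position, and that hidden state depends on the last token's query vector together with the key and value vectors of all earlier tokens. Caching the K/V vectors therefore avoids recomputing them at every step and speeds up inference.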
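The two insights can be checked with a single attention head in plain Python. This is a simplified sketch under stated assumptions: one head, no masking logic beyond "attend to everything so far", and hypothetical names (`last_hidden_no_cache`, `KVCache`) and sizes chosen for the example.

```python
import math
import random

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(v - m) for v in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(q, keys, values):
    """Scaled dot-product attention for one query over all keys/values."""
    d = len(q)
    scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
    weights = softmax(scores)
    return [sum(w * v[j] for w, v in zip(weights, values)) for j in range(d)]

def last_hidden_no_cache(xs, Wq, Wk, Wv):
    """Recompute every key and value, then attend with the last query only."""
    keys = [matvec(Wk, x) for x in xs]
    values = [matvec(Wv, x) for x in xs]
    return attention(matvec(Wq, xs[-1]), keys, values)

class KVCache:
    """Keeps keys and values of earlier tokens so each step projects one token."""

    def __init__(self):
        self.keys, self.values = [], []

    def step(self, x, Wq, Wk, Wv):
        self.keys.append(matvec(Wk, x))      # only the new token is projected
        self.values.append(matvec(Wv, x))
        return attention(matvec(Wq, x), self.keys, self.values)

rng = random.Random(0)
d, n_tokens = 4, 5
Wq, Wk, Wv = ([[rng.gauss(0, 1) for _ in range(d)] for _ in range(d)] for _ in range(3))
xs = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(n_tokens)]

cache = KVCache()
cached = [cache.step(x, Wq, Wk, Wv) for x in xs]
recomputed = [last_hidden_no_cache(xs[: t + 1], Wq, Wk, Wv) for t in range(n_tokens)]
```

Both paths produce the same last-position output at every step, but the cached path performs one key and one value projection per new token instead of re-projecting the whole prefix.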

LLMs Explained in 9 Diagrams

✅ 1. Transformer vs. Mixture of Experts

Transformer

Mixture of Experts (MoE)

✅ 2. 5 Ways to Fine-Tune Large Language Models (the LoRA Family)

LoRA

LoRA-FA

VeRA

Delta-LoRA

LoRA+

✅ 3. Traditional RAG vs. Agentic RAG

Traditional RAG

Agentic RAG

✅ 4. 5 Agentic AI Design Patterns

Reflection

Tool Use

ReAct

Planning

Multi-agent

✅ 5. 5 Chunking Strategies for RAG

Fixed-size

Semantic

Recursive

Structure-based chunking

LLM-based chunking

✅ 6. 5 Capability Levels of Agentic AI Systems

Basic responder

Router pattern

Tool calling

Multi-agent

Autonomous agent

✅ 7. Traditional RAG vs. HyDE

RAG

HyDE

✅ 8. Traditional RAG vs. Graph RAG

RAG

Graph RAG

✅ 9. KV Caching in LLMs

Insight 1

Insight 2

Conclusion
