cs.AI updates on arXiv.org, April 10, 12:03
PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models

This article introduces PiSSA, a parameter-efficient fine-tuning (PEFT) method for large language models (LLMs). PiSSA borrows LoRA's architecture but initializes the adapter matrices with the principal components of the original weight matrix and freezes the remaining residual, achieving faster convergence and better performance than LoRA. Experiments show PiSSA outperforming LoRA across a range of models and tasks, notably on benchmarks such as GSM8K. PiSSA is also compatible with quantization, further reducing the memory cost of fine-tuning, and its initialization is fast and easy to apply.

💡 PiSSA is a parameter-efficient fine-tuning (PEFT) method designed to improve the fine-tuning efficiency of large language models (LLMs). It builds on the LoRA architecture with one key change.

✨ PiSSA's core innovation is its initialization strategy: it initializes the adapter matrices with the principal components of the original weight matrix and freezes the remainder, which lets PiSSA converge faster and reach better performance (a minimal sketch follows these highlights).

🚀 Experiments show that PiSSA outperforms LoRA across 12 models ranging from 184M to 70B parameters, on 5 NLG and 8 NLU tasks. On the GSM8K benchmark, for example, PiSSA raises Mistral-7B's accuracy by 5.16 percentage points.

💡 PiSSA is also compatible with quantization via QPiSSA, which reduces the memory cost of fine-tuning. QPiSSA incurs smaller quantization error at initialization and reaches 86.05% accuracy on LLaMA-3-70B, surpassing QLoRA's 81.73% (a toy comparison follows the abstract below).

⏱️ PiSSA initializes quickly: with a fast SVD technique it takes only a few seconds, making the cost of switching from LoRA to PiSSA negligible.
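
The initialization step these highlights describe is straightforward to sketch. Below is a minimal, hypothetical PyTorch version (the official implementation lives at https://github.com/GraphPKU/PiSSA); the `pissa_init` helper, its balanced sqrt-of-singular-values split, and the toy shapes are illustrative assumptions, not the authors' exact code:

```python
import torch

def pissa_init(W: torch.Tensor, r: int):
    """Split W (m x n) into trainable rank-r adapters (A, B) and a frozen
    residual, following the PiSSA initialization scheme described above."""
    # Thin SVD: W = U diag(S) Vh
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)

    # The r largest singular values/vectors become the trainable adapter.
    # Splitting sqrt(S) between the factors keeps A and B balanced in scale.
    sqrt_S = torch.sqrt(S[:r])
    A = U[:, :r] * sqrt_S            # (m, r), trainable
    B = sqrt_S[:, None] * Vh[:r]     # (r, n), trainable

    # The remaining components form the residual W^res, which stays frozen
    # during fine-tuning (requires_grad is False throughout this sketch).
    W_res = W - A @ B
    return A, B, W_res

# Toy usage: rank-16 split of a 1024 x 4096 weight matrix.
W = torch.randn(1024, 4096)
A, B, W_res = pissa_init(W, r=16)
# At initialization the layer is unchanged: W_res + A @ B reconstructs W.
assert torch.allclose(W_res + A @ B, W, atol=1e-3)
```

The paper credits a fast SVD for the few-second initialization; in PyTorch, a randomized decomposition such as torch.svd_lowrank(W, q=r) is one way to approximate the top-r factors without computing the full SVD.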

arXiv:2404.02948v4 Announce Type: replace-cross Abstract: To parameter-efficiently fine-tune (PEFT) large language models (LLMs), the low-rank adaptation (LoRA) method approximates the model changes $\Delta W \in \mathbb{R}^{m \times n}$ through the product of two matrices $A \in \mathbb{R}^{m \times r}$ and $B \in \mathbb{R}^{r \times n}$, where $r \ll \min(m, n)$, $A$ is initialized with Gaussian noise, and $B$ with zeros. LoRA freezes the original model $W$ and updates the "Noise & Zero" adapter, which may lead to slow convergence. To overcome this limitation, we introduce Principal Singular values and Singular vectors Adaptation (PiSSA). PiSSA shares the same architecture as LoRA, but initializes the adapter matrices $A$ and $B$ with the principal components of the original matrix $W$, and puts the remaining components into a residual matrix $W^{res} \in \mathbb{R}^{m \times n}$ which is frozen during fine-tuning. Compared to LoRA, PiSSA updates the principal components while freezing the "residual" parts, allowing faster convergence and enhanced performance. Comparative experiments of PiSSA and LoRA across 12 different models, ranging from 184M to 70B parameters and encompassing 5 NLG and 8 NLU tasks, reveal that PiSSA consistently outperforms LoRA under identical experimental setups. On the GSM8K benchmark, Mistral-7B fine-tuned with PiSSA achieves an accuracy of 72.86%, surpassing LoRA's 67.7% by 5.16 percentage points. Owing to the shared architecture, PiSSA is also compatible with quantization to further reduce the memory requirements of fine-tuning. Compared to QLoRA, QPiSSA exhibits smaller quantization errors in the initial stages. Fine-tuning LLaMA-3-70B on GSM8K, QPiSSA attains an accuracy of 86.05%, exceeding QLoRA's 81.73%. Leveraging a fast SVD technique, PiSSA can be initialized in only a few seconds, presenting a negligible cost for transitioning from LoRA to PiSSA. Code is available at https://github.com/GraphPKU/PiSSA.
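
The abstract's claim that QPiSSA sees smaller quantization error at initialization has a simple intuition: the largest singular directions of $W$ stay in the full-precision adapters, so the residual that actually gets quantized has a smaller dynamic range. The toy comparison below illustrates this; the crude symmetric uniform quantizer stands in for the NF4 format actually used by QLoRA/QPiSSA, and the matrix construction and `fake_quantize` helper are illustrative assumptions:

```python
import torch

def fake_quantize(x: torch.Tensor, bits: int = 4) -> torch.Tensor:
    """Toy symmetric uniform quantizer (a stand-in for NF4)."""
    levels = 2 ** (bits - 1) - 1
    scale = x.abs().max() / levels
    return torch.round(x / scale).clamp(-levels, levels) * scale

torch.manual_seed(0)
m, n, r = 512, 512, 16

# Build a weight with a few dominant directions, as trained weights tend to have.
U = torch.linalg.qr(torch.randn(m, m))[0][:, :r]
V = torch.linalg.qr(torch.randn(n, n))[0][:, :r]
W = U @ torch.diag(torch.linspace(50.0, 10.0, r)) @ V.T + 0.1 * torch.randn(m, n)

# QLoRA-style: quantize all of W.
err_full = (fake_quantize(W) - W).norm()

# QPiSSA-style: strip the top-r principal components (they live in the
# full-precision adapters), then quantize only the residual.
Uw, Sw, Vhw = torch.linalg.svd(W, full_matrices=False)
W_res = W - (Uw[:, :r] * Sw[:r]) @ Vhw[:r]
err_res = (fake_quantize(W_res) - W_res).norm()

print(f"quantization error, full W:   {err_full:.2f}")
print(f"quantization error, residual: {err_res:.2f}")  # noticeably smaller
```

Because the residual's largest entry is far smaller than that of the full matrix, the quantization step size shrinks accordingly, which is the effect the QPiSSA numbers above reflect.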


Related tags

PiSSA LoRA LLM fine-tuning Parameter-efficient fine-tuning QPiSSA