cs.AI updates on arXiv.org 12小时前
FlashCommunication V2: Bit Splitting and Spike Reserving for Any Bit Communication
index_new5.html
../../../zaker_core/zaker_tpl_static/wap/tpl_guoji1.html

 

本文提出一种名为FlashCommunication V2的通信范式,通过位分割和尖峰保留技术解决大语言模型分布式训练中的通信瓶颈,显著提高通信系统的灵活性和资源利用率,在NVLink和PCIe架构上实现速度提升。

arXiv:2508.03760v1 Announce Type: cross Abstract: Nowadays, communication bottlenecks have emerged as a critical challenge in the distributed training and deployment of large language models (LLMs). This paper introduces FlashCommunication V2, a novel communication paradigm enabling efficient cross-GPU transmission at arbitrary bit widths. Its core innovations lie in the proposed bit splitting and spike reserving techniques, which address the challenges of low-bit quantization. Bit splitting decomposes irregular bit widths into basic units, ensuring compatibility with hardware capabilities and thus enabling transmission at any bit width. Spike reserving, on the other hand, retains numerical outliers (i.e., minima and maxima) as floating-point numbers, which shrinks the dynamic numerical range and pushes the quantization limits to 2-bit with acceptable losses. FlashCommunication V2 significantly enhances the flexibility and resource utilization of communication systems. Through meticulous software-hardware co-design, it delivers robust performance and reduced overhead across both NVLink-based and PCIe-based architectures, achieving a maximum 3.2$\times$ speedup in AllReduce and 2$\times$ in All2All communication.

Fish AI Reader

Fish AI Reader

AI辅助创作,多种专业模板,深度分析,高质量内容生成。从观点提取到深度思考,FishAI为您提供全方位的创作支持。新版本引入自定义参数,让您的创作更加个性化和精准。

FishAI

FishAI

鱼阅,AI 时代的下一个智能信息助手,助你摆脱信息焦虑

联系邮箱 441953276@qq.com

相关标签

大语言模型 通信效率 FlashCommunication V2 位分割 尖峰保留
相关文章