cs.AI updates on arXiv.org, Aug 5, 19:10
VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo

This article introduces VeOmni, a modular and efficient training framework for omni-modal LLMs. By decoupling communication from computation, it enables efficient 3D parallelism and supports flexible integration of new modalities, significantly improving the training efficiency and scalability of omni-modal LLMs.
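The core architectural claim is that parallelization should be expressed as a recipe applied to a plain model rather than baked into the model code. Below is a minimal sketch of that decoupling using stock PyTorch tensor-parallel APIs; the Block module, the tp_plan layout, and the parallelize helper are illustrative assumptions, not VeOmni's actual interface.

```python
# Minimal sketch, not VeOmni's actual code: the model is plain PyTorch,
# and a separate "recipe" (a plan keyed by submodule name) decides how
# each layer is sharded, so communication never leaks into the model.
import torch
import torch.nn as nn
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.tensor.parallel import (
    ColwiseParallel,
    RowwiseParallel,
    parallelize_module,
)

class Block(nn.Module):
    """A plain MLP block with no parallel logic inside (illustrative)."""
    def __init__(self, d: int):
        super().__init__()
        self.up = nn.Linear(d, 4 * d)
        self.down = nn.Linear(4 * d, d)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down(torch.relu(self.up(x)))

# The recipe: a declarative sharding plan kept outside the model definition
# (hypothetical layout chosen for illustration).
tp_plan = {
    "up": ColwiseParallel(),    # shard output features across the TP group
    "down": RowwiseParallel(),  # shard input features; all-reduce the output
}

def parallelize(model: nn.Module, tp_size: int) -> nn.Module:
    # One mesh axis for tensor parallelism; a full 3D setup would add
    # data-parallel and sequence-parallel axes to the same mesh.
    mesh = init_device_mesh("cuda", (tp_size,), mesh_dim_names=("tp",))
    return parallelize_module(model, mesh, tp_plan)
```

Because the plan lives outside the model, swapping the sharding layout, or adding data- and sequence-parallel mesh axes for full 3D parallelism, requires no change to the model class itself.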

arXiv:2508.02317v1 Announce Type: cross Abstract: Recent advances in large language models (LLMs) have driven impressive progress in omni-modal understanding and generation. However, training omni-modal LLMs remains a significant challenge due to the heterogeneous model architectures required to process diverse modalities, necessitating sophisticated system design for efficient large-scale training. Existing frameworks typically entangle model definition with parallel logic, incurring limited scalability and substantial engineering overhead for end-to-end omni-modal training. We present VeOmni, a modular and efficient training framework to accelerate the development of omni-modal LLMs. VeOmni introduces model-centric distributed recipes that decouple communication from computation, enabling efficient 3D parallelism for omni-modal LLMs. VeOmni also features a flexible configuration interface supporting seamless integration of new modalities with minimal code change. Using VeOmni, an omni-modal mixture-of-experts (MoE) model with 30B parameters can be trained at over 2,800 tokens/sec/GPU throughput and scaled to 160K context lengths via 3D parallelism on 128 GPUs, showcasing its superior efficiency and scalability for training large omni-modal LLMs.
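The abstract's second claim, that new modalities integrate with minimal code change, suggests a registry-plus-config pattern. The sketch below shows one plausible shape for such an interface; register_modality, ENCODERS, and build_encoders are hypothetical names invented for illustration and do not come from the paper.

```python
# Hypothetical sketch of a configuration-driven modality interface; the
# names here (register_modality, ENCODERS, build_encoders) are invented
# for illustration and are not VeOmni's real API.
from typing import Callable, Dict
import torch.nn as nn

ENCODERS: Dict[str, Callable[..., nn.Module]] = {}

def register_modality(name: str):
    """Decorator: register an encoder factory under a modality name."""
    def wrap(factory: Callable[..., nn.Module]) -> Callable[..., nn.Module]:
        ENCODERS[name] = factory
        return factory
    return wrap

@register_modality("audio")
def audio_encoder(d_model: int) -> nn.Module:
    # Toy stand-in: project 80-dim mel-spectrogram frames into the LLM's
    # embedding space.
    return nn.Linear(80, d_model)

def build_encoders(config: Dict[str, dict], d_model: int) -> nn.ModuleDict:
    """Instantiate only the modalities named in the training config."""
    return nn.ModuleDict(
        {name: ENCODERS[name](d_model, **kw) for name, kw in config.items()}
    )

encoders = build_encoders({"audio": {}}, d_model=1024)
```

Under this pattern, the "minimal code change" in the abstract would amount to registering one encoder and adding one config entry; the training loop and the parallel recipe stay untouched.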


Related tags

Omni-modal LLM, training framework, VeOmni, 3D parallelism, modality integration