Interconnects, June 26, 22:27
Latest open artifacts (#11): Visualizing China's open models market share, Arcee's models, and VLAs for robotics

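The article examines the dominance of Chinese models among open models and analyzes their effect on the open-source ecosystem. The analysis finds that although Western organizations release more models, many of those depend on Chinese models, Qwen in particular. Chinese models have advantages in performance and licensing and have strongly influenced the direction of the open ecosystem. The article also introduces several noteworthy open artifacts, including datasets, image/video generation models, and reasoning models, and stresses the open-source community's role in driving technical progress.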

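🇨🇳 **Chinese models dominate:** Although Western organizations release most open models, those models often rely on Chinese models such as Qwen. Chinese models have advantages in performance and licensing and have had a major influence on the open ecosystem.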

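💡 **The importance of open datasets:** EleutherAI's "Common Pile" dataset is clearly licensed and covers 8TB of data, including code, legal documents, and public-domain books. This differs from other datasets that rely on the "fair use" doctrine, and it is highly valuable for the development of open-source models.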

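🚀 **A wave of new models:** Arcee AI released a 4.5B-parameter model and opened up its previously closed models. The moondream2 team released its first reasoning model, which performs well on object detection. Qwen also entered the retrieval space with its own models. MiniMaxAI released a reasoning model built on a hybrid-attention MoE architecture, and Mistral released an open model based on Mistral Small.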

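📝 **Improved reasoning:** Many model releases now incorporate reasoning into the training pipeline. MiniMaxAI's reasoning model has a thinking budget of up to 80K, though it may overthink in testing.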

In previous posts, we've noted that Chinese models currently dominate the open-model space. To quantify this, we analyzed the geographic distribution of all models from past Artifacts collections. It turns out that most of the artifacts (~60%) are released by Western organizations, but most of these rely on Chinese models (i.e., Qwen). Crucially, beyond raw counts, Chinese models have also been qualitatively more impactful on the direction of the open ecosystem, because they are the releases closest to the frontier of performance with the most permissive licenses.

We present only a selection of models in the Artifacts series, chosen based on a mix of our perceived immediate and long-term impact. Our analysis is broader than text-only language models; it also includes image/video generation models, where Chinese labs are more dominant.

For attribution, we count fine-tunes according to the team that released them, i.e., a Qwen fine-tune published by a Western company is marked as Western.

When looking into the models' heritage (as reported by HuggingFace), the picture is as expected: Qwen is the first choice for anyone who fine-tunes a model.
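To make the two counting rules concrete, here is a minimal Python sketch. It assumes each artifact has been reduced to a small record; the `Artifact` class, its field names, and the five sample rows are hypothetical and are not the data behind the post. It tallies artifacts by the region of the releasing team (the attribution rule above) and separately tallies the base models they were fine-tuned from (the heritage view).

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Artifact:
    """Hypothetical record for one released artifact."""
    name: str
    releaser_region: str         # region of the team that published it
    base_model: Optional[str]    # parent model family, None if trained from scratch
    base_region: Optional[str]   # region of the parent model's creator


# Made-up rows for illustration only; not the dataset used in the post.
ARTIFACTS = [
    Artifact("finetune-a", "Western", "Qwen", "China"),
    Artifact("finetune-b", "Western", "Qwen", "China"),
    Artifact("finetune-c", "Western", "Llama", "Western"),
    Artifact("base-d", "China", None, None),
    Artifact("finetune-e", "China", "Qwen", "China"),
]


def releaser_share(artifacts):
    """Share of artifacts per region, attributed to the releasing team."""
    counts = Counter(a.releaser_region for a in artifacts)
    total = sum(counts.values())
    return {region: n / total for region, n in counts.items()}


def heritage_counts(artifacts):
    """How often each base model is used as a fine-tuning parent."""
    return Counter(a.base_model for a in artifacts if a.base_model is not None)


def western_built_on_chinese(artifacts):
    """Fraction of Western-released artifacts whose parent model is Chinese."""
    western = [a for a in artifacts if a.releaser_region == "Western"]
    if not western:
        return 0.0
    return sum(a.base_region == "China" for a in western) / len(western)


if __name__ == "__main__":
    print(releaser_share(ARTIFACTS))
    print(heritage_counts(ARTIFACTS).most_common())
    print(western_built_on_chinese(ARTIFACTS))
```

Keeping the releaser tally and the heritage tally separate is what lets both statements hold at once: a region can publish the majority of artifacts while most of those artifacts descend from another region's base models.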

Also, RL and reasoning are now part of the training pipeline for many model releases, so we no longer treat reasoning as its own category. Together with links being broken out into their own series, this should give these posts a more streamlined structure: our picks, then models, then datasets.

Our Picks

Models

Flagship
