Trending
Articles on "Mixture-of-Experts (MoE) Models"
Innovator: Scientific Continued Pretraining with Fine-grained MoE Upcycling
cs.AI updates on arXiv.org 2025-07-28T04:42:45.000000Z
Qwen Releases Qwen3-Coder-480B-A35B-Instruct: Its Most Powerful Open Agentic Code Model Yet
MarkTechPost@AI 2025-07-23T03:54:43.000000Z
[Promoted] Kimi k2 online
V2EX 2025-07-13T09:24:39.000000Z
[Promoted] Kimi k2 online
V2EX 2025-07-13T08:28:17.000000Z
MoSE: Skill-by-Skill Mixture-of-Expert Learning for Autonomous Driving
cs.AI updates on arXiv.org 2025-07-11T04:03:58.000000Z
每日互动: DeepSeek R2 expected to launch this month, with a second wave of gains possibly ahead
韭研公社 2025-07-11T02:42:14.000000Z
Efficient Training of Large-Scale AI Models Through Federated Mixture-of-Experts: A System-Level Approach
cs.AI updates on arXiv.org 2025-07-09T04:01:47.000000Z
City-Level Foreign Direct Investment Prediction with Tabular Learning on Judicial Data
cs.AI updates on arXiv.org 2025-07-09T04:01:27.000000Z
Pangu team responds to plagiarism allegations: the model was not trained on other models, and copyright notices for open-source code have been included
Cnbeta 2025-07-05T09:55:35.000000Z
Huawei Pangu makes its first public appearance: an Ascend-native 72B MoE architecture, tied for first place in China among sub-100B models on SuperCLUE
掘金 人工智能 2025-05-28T09:08:02.000000Z
The Rise of Mixture-of-Experts: How Sparse AI Models Are Shaping the Future of Machine Learning
Unite.AI 2025-05-06T23:22:34.000000Z
Report: DeepSeek R2 to launch next month, with costs 97% lower than GPT and Huawei chip performance said to rival NVIDIA
最新-新浪科技科学探索 2025-04-29T14:13:27.000000Z
Qwen3 arrives in force: top-tier reasoning, support for 100+ languages, and leading agent collaboration, setting the pace for open-source LLMs
掘金 人工智能 2025-04-29T07:28:04.000000Z
Report: DeepSeek R2 to launch next month, with costs 97% lower than GPT and Huawei chip performance said to rival NVIDIA
快科技资讯 2025-04-29T01:16:24.000000Z
Alibaba Qwen Team Just Released Qwen3: The Latest Generation of Large Language Models in Qwen Series, Offering a Comprehensive Suite of Dense and Mixture-of-Experts (MoE) Models
MarkTechPost@AI 2025-04-29T01:10:35.000000Z
Multimodal Models Don’t Need Late Fusion: Apple Researchers Show Early-Fusion Architectures are more Scalable, Efficient, and Modality-Agnostic
MarkTechPost@AI 2025-04-14T22:20:29.000000Z
Meta unveils Llama 4 built on an MoE architecture, open-sourcing the 400B- and 109B-parameter Maverick models
AI & Big Data 2025-04-07T01:28:48.000000Z
Report: Ant Group trains AI on domestic chips from Alibaba, Huawei, and others, with performance matching NVIDIA's H800 and costs cut by 20%
IT之家 2025-03-24T06:12:49.000000Z
Observations About LLM Inference Pricing
少点错误 2025-03-04T03:04:11.000000Z
DeepSeek's big open-source move: FlashMLA sends H800 compute soaring! Its low-cost playbook revealed
智源社区 2025-02-25T03:18:07.000000Z