Trending
Articles related to "training process"
142-Page Paper Unveils DeepSeek-R1's "Thinking Brain"! Opening a New Field of "Chain-of-Thought Studies"
智源社区 2025-04-23T05:48:03.000000Z
A 7B Diffusion LLM Can Actually Go Toe-to-Toe with the 671B DeepSeek V3: Diffusion vs. Autoregression, Which Is the Future?
机器之心 2025-04-05T07:57:03.000000Z
喝点VC | a16z's Internal Retrospective on DeepSeek: The New AI Model Landscape amid Reasoning-Model Innovation and a 20x Compute Challenge
Z Potentials 2025-03-23T08:12:41.000000Z
Allen Institute for AI (AI2) Releases OLMo 32B: A Fully Open Model to Beat GPT 3.5 and GPT-4o mini on a Suite of Multi-Skill Benchmarks
MarkTechPost@AI 2025-03-14T22:47:10.000000Z
Upending the LLM Landscape! AI2's New Model OLMo2: Fully Open Training Process, Upgrades to Both Data and Architecture
新智元 2025-01-25T17:07:25.000000Z
ReaderLM v2: A Frontier Small Language Model for Converting HTML to Markdown and JSON
Jina AI 2025-01-19T16:44:20.000000Z
Latest Research from a Chinese Team at Microsoft: From LLM to LAM, Giving Large Models a Real "Ability to Act"!
新智元 2025-01-14T16:13:57.000000Z
Meet HuatuoGPT-o1: A Medical LLM Designed for Advanced Medical Reasoning
MarkTechPost@AI 2024-12-30T17:51:24.000000Z
Meta’s COCONUT: The AI Method That Thinks Without Language
Unite.AI 2024-12-16T15:54:20.000000Z
AMD Launches Its First Small Language AI Model, "Llama-135m", Featuring "Speculative Decoding" to Reduce RAM Usage
IT之家 2024-09-29T09:23:30.000000Z
LLaVaOLMoBitnet1B: The First Ternary Multimodal LLM Capable of Accepting Image(s) and Text Inputs to Produce Coherent Textual Response
MarkTechPost@AI 2024-09-03T09:20:13.000000Z
DIY RLHF: A simple implementation for hands on experience
少点错误 2024-07-10T12:20:40.000000Z