Training Process_Fishai

Articles related to "Training Process"

On the Principles of ReLU Networks with One Hidden Layer

cs.AI updates on arXiv.org 2025-07-14T04:08:25.000000Z

142-Page Paper Unpacks DeepSeek-R1's "Thinking Brain" and Opens a New Field of "Chain-of-Thought Studies"

智源社区 2025-04-23T05:48:03.000000Z

A 7B Diffusion LLM Can Actually Go Toe-to-Toe with the 671B DeepSeek V3: Diffusion vs. Autoregression, Which Is the Future?

机器之心 2025-04-05T07:57:03.000000Z

喝点VC | a16z's Internal Retrospective on DeepSeek: The New AI Model Landscape Under Reasoning-Model Innovation and a 20x Compute Challenge

Z Potentials 2025-03-23T08:12:41.000000Z

Allen Institute for AI (AI2) Releases OLMo 32B: A Fully Open Model to Beat GPT 3.5 and GPT-4o mini on a Suite of Multi-Skill Benchmarks

MarkTechPost@AI 2025-03-14T22:47:10.000000Z

Upending the LLM Landscape! AI2's New Model OLMo2: Fully Open Training Process, Upgrades to Both Data and Architecture

新智元 2025-01-25T17:07:25.000000Z

ReaderLM v2: A Frontier Small Language Model for Converting HTML to Markdown and JSON

Jina AI 2025-01-19T16:44:20.000000Z

Latest Research from a Chinese Team at Microsoft: From LLM to LAM, Giving Large Models a Real Ability to Act

新智元 2025-01-14T16:13:57.000000Z

Meet HuatuoGPT-o1: A Medical LLM Designed for Advanced Medical Reasoning

MarkTechPost@AI 2024-12-30T17:51:24.000000Z

Meta’s COCONUT: The AI Method That Thinks Without Language

Unite.AI 2024-12-16T15:54:20.000000Z

AMD Launches Its First Small Language AI Model, "Llama-135m", Featuring Speculative Decoding That Can Reduce RAM Usage

IT之家 2024-09-29T09:23:30.000000Z

LLaVaOLMoBitnet1B: The First Ternary Multimodal LLM Capable of Accepting Image(s) and Text Inputs to Produce Coherent Textual Response

MarkTechPost@AI 2024-09-03T09:20:13.000000Z

DIY RLHF: A simple implementation for hands on experience

少点错误 2024-07-10T12:20:40.000000Z

Copyright © 2019 FISHAI. All Rights Reserved