Trending
Articles related to "Automatic Evaluation"
ASTRID -- An Automated and Scalable TRIaD for the Evaluation of RAG-based Clinical Question Answering Systems
cs.AI updates on arXiv.org 2025-07-21T04:06:49.000000Z
Learn Globally, Speak Locally: Bridging the Gaps in Multilingual Reasoning
cs.AI updates on arXiv.org 2025-07-09T04:01:41.000000Z
DeepMind Unveils a Major New AI That Can Correct Its Own Errors, but Has a Fatal Weakness
Cnbeta 2025-05-15T02:22:36.000000Z
Can LLMs Design Good Questions Based on Context? This AI Paper Evaluates Questions Generated by LLMs from Context, Comparing Them to Human-Generated Questions
MarkTechPost@AI 2025-01-11T01:13:18.000000Z
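Automated Benchmarks | Some Evaluation Datasets
BAAI Community (智源社区) 2025-01-09T05:07:26.000000Z
Automated Benchmarks | Tips and Tricks
BAAI Community (智源社区) 2024-12-28T05:01:57.000000Z
Automated Benchmarks | Designing Your Automatic Evaluation Task
BAAI Community (智源社区) 2024-12-26T13:19:18.000000Z
Automated Benchmarks | Designing Your Automatic Evaluation Task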
Hugging Face 2024-12-26T10:39:17.000000Z
OLAPH: A Simple and Novel AI Framework that Enables the Improvement of Factuality through Automatic Evaluations
MarkTechPost@AI 2024-05-27T14:30:55.000000Z