Evaluation Methods_Fishai

Articles related to "Evaluation Methods"

Letting an LLM Be the Judge | Tips and Tricks

Hugging Face 2025-05-13T16:51:55.000000Z

The Second Half: An OpenAI Scientist's Revelations on the Second Half of AI

海外独角兽 2025-04-19T06:21:46.000000Z

Beware of AI's “Rare” Dangerous Behaviors

虎嗅-AI 2025-02-27T09:20:08.000000Z

Letting an LLM Be the Judge | On Reward Models

Hugging Face 2025-02-14T17:15:15.000000Z

Livestream | Trending LLM-as-a-Judge Papers, a Survey Talk on AI Becoming the “Judge”, an AI + Finance Roundtable, IDEA Research Institute

智源社区 2025-01-14T09:05:19.000000Z

Assessment in Computer Science Education in the GenAI Era

Communications of the ACM - Artificial Intelligence 2025-01-10T16:17:07.000000Z

From Contradictions to Coherence: Logical Alignment in AI Models

MarkTechPost@AI 2025-01-09T06:34:43.000000Z

DeepMind Research Introduces The FACTS Grounding Leaderboard: Benchmarking LLMs’ Ability to Ground Responses to Long-Form Input

MarkTechPost@AI 2025-01-08T04:25:14.000000Z

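Your Own “Iron Man” Assistant, OS Agents, Is Here! Zhejiang University Joins OPPO, 01.AI, and Others, 10 Institutions in All, to Release a New Survey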

量子位 2025-01-06T07:58:05.000000Z

This AI Paper Introduces LLM-as-an-Interviewer: A Dynamic AI Framework for Comprehensive and Adaptive LLM Evaluation

MarkTechPost@AI 2025-01-04T01:11:13.000000Z

New Evals for Better Models, AI Research Papers Made Easier to Understand, Train Your Own Flux LoRA, and More

Society's Backend 2024-12-13T06:24:24.000000Z

Red Teaming for AI: Strengthening Safety and Trust through External Evaluation

MarkTechPost@AI 2024-11-26T07:49:56.000000Z

Researchers at Peking University Introduce A New AI Benchmark for Evaluating Numerical Understanding and Processing in Large Language Models

MarkTechPost@AI 2024-11-09T08:19:46.000000Z

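Large Models Go for an “Oscar” Too: HKUST, Tencent, and Others Present a Panoramic Survey of AI Role-Playing, Dissecting Key Details from Four Angles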

智源社区 2024-11-04T05:08:10.000000Z

Sabotage Evaluations for Frontier Models

少点错误 2024-10-18T22:38:04.000000Z

OpenAI's Latest 53-Page Paper: ChatGPT Treats Users Differently by Name, Answering “Xiaomei” and “Xiaoshuai” Inconsistently

IT之家 2024-10-16T06:23:42.000000Z

Exposing Vulnerabilities in Automatic LLM Benchmarks: The Need for Stronger Anti-Cheating Mechanisms

MarkTechPost@AI 2024-10-13T12:51:10.000000Z

Nature: AI Has Even Won a Nobel Prize, but Can It Have Common Sense Like Humans?

智源社区 2024-10-11T14:53:56.000000Z

AI Has Even Won a Nobel Prize, but Can It Have Common Sense Like Humans?

虎嗅 2024-10-11T01:24:06.000000Z

A Full-Stack Technical Survey of LLM-Based NL2SQL Frameworks

PaperAgent 2024-10-04T12:38:03.000000Z
