Scalable Oversight_Fishai

Articles related to "Scalable Oversight"

Research Areas in Evaluation and Guarantees in Reinforcement Learning (The Alignment Project by UK AISI)

LessWrong 2025-08-01T19:16:02.000000Z

Research Areas in Learning Theory (The Alignment Project by UK AISI)

LessWrong 2025-08-01T10:43:07.000000Z

Research Areas in Computational Complexity Theory (The Alignment Project by UK AISI)

LessWrong 2025-08-01T10:43:07.000000Z

Research Areas in Benchmark Design and Evaluation (The Alignment Project by UK AISI)

LessWrong 2025-08-01T10:43:06.000000Z

Research Areas in Cognitive Science (The Alignment Project by UK AISI)

LessWrong 2025-08-01T10:43:06.000000Z

Rational Animations' video about scalable oversight and sandwiching

LessWrong 2025-07-06T14:02:34.000000Z

Prover-Estimator Debate: A New Scalable Oversight Protocol

LessWrong 2025-06-17T13:55:20.000000Z

New MIT research quantifies the challenge of AI oversight: could the success rate of controlling AI smarter than us be below 52%?

MIT Technology Review - This Week's Top Stories 2025-05-10T02:06:43.000000Z

UK AISI’s Alignment Team: Research Agenda

LessWrong 2025-05-07T16:37:29.000000Z

AGI loss-of-control rate above 90%! MIT professor calculates the "Compton constant": is AI's "takeover rate" for Earth already locked in?

BAAI Community (智源社区) 2025-05-06T02:48:02.000000Z

Is weak-to-strong generalization an alignment technique?

LessWrong 2025-01-31T07:17:53.000000Z

Balancing Label Quantity and Quality for Scalable Elicitation

LessWrong 2024-10-24T17:23:37.000000Z

How should we make trade-offs between the quantity and quality of labels used for eliciting knowledge from capable AI systems?

LessWrong 2024-10-24T16:53:07.000000Z

On scalable oversight with weak LLMs judging strong LLMs

LessWrong 2024-07-08T09:05:26.000000Z

Oversharing Details of NYU’s Work on Implementing Debate as an Alignment Technique

LessWrong 2024-07-06T20:50:08.000000Z

Scalable oversight as a quantitative rather than qualitative problem

LessWrong 2024-07-06T17:50:10.000000Z

Using AI to oversee AI: OpenAI manages to lift itself by its own bootstraps

36kr 2024-07-02T11:33:49.000000Z

GPT-4 critiques GPT-4 to achieve "self-improvement": another major work from OpenAI's former Superalignment team is made public

36kr 2024-06-28T10:03:45.000000Z

Copyright © 2019 FISHAI. All Rights Reserved