Trending
Articles related to "scalable oversight"
Is weak-to-strong generalization an alignment technique?
LessWrong 2025-01-31T07:17:53.000000Z
Balancing Label Quantity and Quality for Scalable Elicitation
LessWrong 2024-10-24T17:23:37.000000Z
How should we make trade-offs between the quantity and quality of labels used for eliciting knowledge from capable AI systems?
LessWrong 2024-10-24T16:53:07.000000Z
On scalable oversight with weak LLMs judging strong LLMs
LessWrong 2024-07-08T09:05:26.000000Z
Oversharing Details of NYU’s Work on Implementing Debate as an Alignment Technique
LessWrong 2024-07-06T20:50:08.000000Z
Scalable oversight as a quantitative rather than qualitative problem
LessWrong 2024-07-06T17:50:10.000000Z
Using AI to supervise AI: OpenAI manages to lift itself by its own bootstraps
36kr 2024-07-02T11:33:49.000000Z
GPT-4 critiques GPT-4 to achieve "self-improvement": another major work from OpenAI's former Superalignment team is made public
36kr 2024-06-28T10:03:45.000000Z