LessWrong, January 16
List of AI safety papers from companies, 2023–2024

 

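The author collected safety research published by frontier AI companies in 2023 and 2024 and planned to have researchers score it in order to compare the labs' output, but gave up because scorers were hard to find. The author hoped to use the collection to keep the safety community informed about the labs' research, among other goals, but will probably not get to it. The post also flags some known problems and thanks others for their help.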

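🎯 The author collects 2023 and 2024 safety research from frontier AI companies

📋 Planned to have researchers score the papers to compare the labs' output, then gave up

💡 Hopes to inform the safety community about the research and to incentivize the labs

⚠️ Notes some existing problems, such as inaccurate categorization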

Published on January 15, 2025 6:00 PM GMT

I'm collecting (x-risk-relevant) safety research from frontier AI companies published in 2023 and 2024: https://docs.google.com/spreadsheets/d/10_dzImDvHq7eEag6paK6AmIdAGMBOA7yXUvumODhZ5U/edit?usp=sharing.


I was planning to get AI safety researchers to score each of the papers, so that we could compare the labs on quality-adjusted safety research output. I'm giving up on this for now, largely because I expect to struggle to find scorers. Let me know if you want to collaborate on this.

I kinda hope to build on this to help the safety community understand the labs' safety research and to incentivize the labs, but I probably won't get around to it.

If you see something that seems wrong—missing,[1] poorly categorized, credit assignment nuances, whatever—please DM me, comment in the spreadsheet, comment below, or make a copy and comment on it and share that with me. The spreadsheet is currently unreliable.

Thanks to Oscar Delaney and Oliver Guest for help finding some papers. My spreadsheet is partially based on theirs. I see my collection as improving on theirs; the main difference is I'm more picky or opinionated or focused on x-risk.


Disclaimers:

  1. ^ Except collaborations. I currently mostly ignore collaborations, including MATS. But feel free to mention particularly noteworthy collaborations, or exhaustive-ish lists for me to link to.


