List of AI safety papers from companies, 2023-2024

Published on January 15, 2025 6:00 PM GMT

I'm collecting (x-risk-relevant) safety research from frontier AI companies published in 2023 and 2024: https://docs.google.com/spreadsheets/d/10_dzImDvHq7eEag6paK6AmIdAGMBOA7yXUvumODhZ5U/edit?usp=sharing.

I was planning to get AI safety researchers to score each of the papers, so that we could compare the labs on quality-adjusted safety research output. I'm giving up on this for now, largely because I expect to struggle to find scorers. Let me know if you want to collaborate on this.

I kinda hope to build on this to

inform the safety community about labs' published research, make the basic situation widely legible, and incentivize labs to publish more good safety research / help internalize the positive externality of publishing good safety research,

but I probably won't get around to it.

If you see something that seems wrong (missing,[1] poorly categorized, credit assignment nuances, whatever), please DM me, comment in the spreadsheet, comment below, or make a copy and comment on it and share that with me. The spreadsheet is currently unreliable.

Thanks to Oscar Delaney and Oliver Guest for help finding some papers. My spreadsheet is partially based on theirs. I see my collection as improving on theirs; the main difference is I'm more picky or opinionated or focused on x-risk.

Disclaimers:

Generally what's included vs excluded is somewhat inconsistent and definitely unclear. This is pretty bad.

value

boost external safety research

[1] Except collaborations. I currently mostly ignore collaborations, including MATS. But feel free to mention particularly noteworthy collaborations, or exhaustive-ish lists for me to link to.
