AISN #53: An Open Letter Attempts to Block OpenAI Restructuring

This edition of the AI Safety Newsletter covers the OpenAI restructuring controversy and CAIS's safety benchmarking competition. Former employees and experts have published an open letter urging officials to block OpenAI's restructuring into a for-profit entity, arguing the move would threaten its charitable mission. The letter contends that restructuring could weaken governance safeguards over AGI and place shareholder interests above the public interest. Separately, CAIS announced the winners of its SafeBench competition, which awarded $250,000 in prizes for benchmarks that assess and reduce AI risks across four areas: robustness, monitoring, alignment, and safety applications. Several winning benchmarks provide important new tools for AI safety research.

🛑 OpenAI restructuring controversy: Experts and former employees urge the Attorneys General of California and Delaware to block OpenAI's restructuring into a for-profit entity, arguing it would undermine the organization's original charitable mission, jeopardize governance safeguards over AGI, and leave the company more exposed to profit motives.

🏆 SafeBench results: CAIS's SafeBench competition awarded $250,000 in prizes for benchmarks that assess and reduce AI risks. The competition focused on four areas (robustness, monitoring, alignment, and safety applications) and attracted nearly one hundred submissions; winning benchmarks include Cybench, AgentDojo, and BackdoorLLM.

🛡️ Winning safety benchmarks: Cybench evaluates language models' cybersecurity capabilities; AgentDojo evaluates prompt-injection attacks and defenses for LLM agents; BackdoorLLM evaluates backdoor attacks against large language models; CVE-Bench evaluates AI agents' ability to exploit real web-application vulnerabilities; JailBreakV evaluates multimodal LLMs' robustness against jailbreak attacks; Poser detects alignment faking by manipulating LLM internals; Me, Myself, and AI tests LLMs' situational awareness; BioLP-bench evaluates LLMs' understanding of biology lab protocols.

Published on April 29, 2025 4:13 PM GMT

Welcome to the AI Safety Newsletter by the Center for AI Safety. We discuss developments in AI and AI safety. No technical background required.

In this edition: Experts and ex-employees urge the Attorneys General of California and Delaware to block OpenAI’s for-profit restructure; CAIS announces the winners of its safety benchmarking competition.

Listen to the AI Safety Newsletter for free on Spotify or Apple Podcasts.

Subscribe to receive future versions.


An Open Letter Attempts to Block OpenAI Restructuring

A group of former OpenAI employees and independent experts published an open letter urging the Attorneys General (AGs) of California (where OpenAI operates) and Delaware (where OpenAI is incorporated) to block OpenAI’s planned restructuring into a for-profit entity. The letter argues the move would fundamentally undermine the organization's charitable mission by jeopardizing the governance safeguards designed to protect control over AGI from profit motives.

OpenAI was founded with the charitable purpose to ensure that artificial general intelligence benefits all of humanity. OpenAI’s original nonprofit structure, and later its capped-profit model, were designed to control profit motives in the development of AGI, which OpenAI defines as "highly autonomous systems that outperform humans at most economically valuable work." The structure was designed to prevent profit motives from incentivizing OpenAI to take risky development decisions and divert much of the wealth produced by AGI to private shareholders.

The proposed restructuring into a Public Benefit Corporation (PBC) would dismantle the governance safeguards OpenAI originally championed. The letter highlights that the restructuring would transfer control away from the nonprofit entity, whose primary fiduciary duty is to humanity, to a for-profit board whose directors would be partly beholden to shareholder interests. The authors detail several specific safeguards currently in place that would be undermined or eliminated.

The letter concludes by asking the Attorneys General of California and Delaware to halt the restructuring and protect OpenAI’s charitable mission. The authors argue that transferring control of potentially the most powerful technology ever created to a for-profit entity fundamentally contradicts OpenAI's charitable obligations. They urge the AGs to use their authority to investigate the proposed changes and ensure that the governance structures prioritizing public benefit over private gain remain intact.

SafeBench Winners

CAIS recently concluded its SafeBench competition, which awarded prizes for new benchmarks for assessing and reducing risks from AI. Sponsored by Schmidt Sciences, the competition awarded $250,000 across eight winning submissions.

The competition focused on four key areas—Robustness, Monitoring, Alignment, and Safety Applications—attracting nearly one hundred submissions. A panel of judges evaluated submissions based on the clarity of safety assessment, the potential benefit of progress on the benchmark, and the ease of evaluating measurements.

Three Benchmarks Awarded First Prize. Three submissions received first prizes of $50,000 each, recognized for their applicability to frontier models, relevance to current safety challenges, and use of large datasets.

Five Benchmarks Recognized with Second Prize. Five additional submissions were awarded $20,000 each for their innovative approaches to evaluating specific AI safety risks.

These benchmarks provide crucial tools for understanding the progress of AI, evaluating risks, and ultimately reducing potential harms. The papers, code, and datasets for all winning benchmarks are publicly available for further research and use. CAIS hopes to see future work which is inspired by or builds on these submissions.

Other News

Government

Research and Opinion

AI Frontiers


See also: CAIS website, X account for CAIS, our paper on superintelligence strategy, our AI safety course, and AI Frontiers, a new platform for expert commentary and analysis.




