Legal Supervision of Frontier AI Labs is the answer.

The article examines the safety risks created by the internal deployment of AI systems, particularly when AI labs choose not to release their most capable models. It argues that, to address this trend, independent legal supervisors are needed to scrutinise frontier AI labs. Such a supervisory body should have the power to inspect systems, test for risks, and intervene before deployment in order to keep AI safe. The article also reviews existing regulatory approaches, such as internal governance and the EU AI Act, and stresses the need for a comparable supervisory body in the United States to address the potential risks of advancing AI.

🛡️ The risks of internal AI deployment are growing. When AI labs choose not to release their most capable models, existing regulatory frameworks can fail: because those models never interact with the public, they fall outside the reach of oversight.

⚖️ The article argues for an independent legal supervisory body. It should have the power to inspect, test, and intervene, much like the EU AI Act provides, but it must be established where the frontier labs are based, such as the United States, so that oversight reaches the early stages of AI development.

⚠️ The limits of internal governance are highlighted. Relying solely on a lab's own governance structures, such as boards or long-term governance trusts, is not enough: they depend on information the lab chooses to share and struggle to catch problems early.

Published on May 5, 2025 1:36 PM GMT

If the biggest threat model for AI systems is internal deployment, then the correct governance move is to establish independent legal supervisors for frontier AI labs[1].


Steven Adler recently argued against relying on a "race to the top," where frontier labs compete to be the safest when deploying models.

“A race to the top can improve AI safety, but it doesn’t solve the ‘adoption problem’: getting all relevant developers to adopt safe enough practices.”

We should be sceptical of actors racing to build what could become the most powerful technology in human history and then saying they'll compete to be safe. There is already plenty of evidence to justify that scepticism. Google didn’t release model evaluations for Gemini 2.5 until after they were publicly criticised, and it is unclear whether they ever intended to do so. OpenAI, meanwhile, is attempting to convert its non-profit structure and has made several other choices that undermine safety.

If a serious race to the top were truly underway, we might as well go home.

But we cannot go home.

If these labs can’t be trusted, then someone else must keep them in check. So where are the checks and balances for AI labs?

Internal structures like boards or long-term governance trusts do have some power to provide oversight. But they are too far removed to catch problems early, and too reliant on information the lab chooses to share.

What is needed is a higher standard of scrutiny: deeper inspection, independent testing, and people with direct access to the models and their behaviour.

This level of oversight requires more than internal governance. It calls for an independent body with a clear legal mandate. Not friendly auditors chosen by the labs, but supervisors with the authority to inspect systems, test for risks like scheming, and intervene before deployment. They must have the power to obtain warrants, demand documentation, and compel access when necessary. Legislation should grant these powers and set clear boundaries for how they are used.

There is good reason to be sceptical of applying existing legal frameworks to frontier AI. Taking something off the shelf, like liability or contract law, probably will not work by default. And we should also be sceptical that the law can keep pace with how quickly this technology is moving. 'Statutory law and case law never keep up with Moore’s law.'

But that is not a reason to give up on legal approaches altogether. It is a reason to ask how the law might be reworked to meet the scale and urgency of the challenge.

In this world, the case for legal supervision becomes clearer. Supervisors should have the authority to act when necessary to keep this technology safe. That includes access to training runs, compulsory evaluations, compute-stop orders, civil fines, and the ability to seek emergency injunctive relief.


A more immediate reason to introduce supervision is the shift toward internal-only deployment. Labs may choose not to release their most capable models at all. Instead, they can publish weaker systems to maintain an income stream, while using more powerful models internally to accelerate research toward superintelligence. This approach reduces legal risk, keeps competitors in the dark, and centralises control.

As deployment becomes more internal, secrecy is likely to grow. Labs may stop sharing models with external evaluators entirely. If that happens, the UK AI Security Institute will lose meaningful access. So will groups like Apollo Research and METR, whose evaluations depend on cooperation from the labs.

A deceptive model may evade detection, especially without external testing. Without independent scrutiny, dangerous behaviours can go unnoticed. Concentrating this level of capability in one organisation without oversight poses risks not just to the lab, but to humanity too.

Apollo Research's paper addresses the risks of internal deployment and proposes two internal oversight bodies for labs that choose not to release their models publicly.

The first is the Internal Deployment Team (IDT), a technical group within the lab tasked with testing models, installing access limits and other safeguards, and monitoring their use. The second is the Internal Deployment Overseeing Board (IDOB), a board appointed by the lab to approve or block deployments, set usage rules, and audit the IDT’s enforcement.
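Purely as a toy sketch of how that division of labour might look if wired into a deployment pipeline (every name, field, and threshold below is hypothetical and not taken from the Apollo paper), a minimal Python model of the IDOB approving or blocking an internal deployment based on an IDT report could look like this:

```python
from dataclasses import dataclass, field

@dataclass
class IDTReport:
    """Hypothetical summary the Internal Deployment Team might produce."""
    model_id: str
    scheming_findings: int       # count of deceptive-behaviour findings from testing
    safeguards_installed: bool   # access limits, monitoring, and other controls in place

@dataclass
class IDOB:
    """Toy Internal Deployment Overseeing Board: approves or blocks deployments."""
    max_allowed_findings: int = 0
    audit_log: list = field(default_factory=list)

    def review(self, report: IDTReport) -> bool:
        # Approve only if safeguards are in place and findings are within tolerance.
        approved = (report.safeguards_installed
                    and report.scheming_findings <= self.max_allowed_findings)
        self.audit_log.append((report.model_id, approved))
        return approved

# Hypothetical usage: two scheming findings means the deployment is blocked.
board = IDOB()
print(board.review(IDTReport("internal-model-x", scheming_findings=2,
                             safeguards_installed=True)))  # False
```

The sketch mainly shows where a supervisor would need visibility: the IDT's reports and the IDOB's audit log are exactly the artefacts an external body would want access to.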

Apollo suggests that the IDT could be made up of existing staff or new hires selected by the lab. This approach should raise concerns. It accepts that labs can supervise themselves. At best, this could be part of a broader solution. But there must also be independent supervision from the state. It is unwise to trust a system that relies on AI labs to self-regulate in matters of safety and security.

The EU AI Office offers one example of what state-level supervision could look like. Under the EU AI Act, it can demand documentation, carry out independent evaluations, mandate risk mitigation, recall or withdraw models, and issue fines of up to 15 million euros or 3 percent of global turnover.
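To make the scale of that penalty concrete, here is a quick illustrative calculation, assuming the applicable ceiling is the higher of the fixed amount and the turnover-based figure (the turnover used below is purely hypothetical):

```python
def eu_ai_act_fine_cap(global_turnover_eur: float) -> float:
    """Illustrative fine ceiling: the greater of EUR 15 million and 3% of global turnover.

    Mirrors the figures cited above; the 'whichever is higher' reading is an
    assumption about how the cap is applied.
    """
    fixed_cap = 15_000_000                      # EUR 15 million
    turnover_cap = 0.03 * global_turnover_eur   # 3 percent of worldwide turnover
    return max(fixed_cap, turnover_cap)

# Hypothetical lab with EUR 2 billion in annual global turnover:
print(eu_ai_act_fine_cap(2_000_000_000))  # 60000000.0 -> the turnover prong dominates
```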

This is a step in the right direction. But the Act only applies when a model affects people in the EU. If a system is developed and deployed entirely within a lab, without ever interacting with EU citizens, the law does not apply.

The more pressing concern is what happens when powerful models are never released. If labs keep their most advanced systems internal, many governance approaches may never be triggered. Legal frameworks built around public deployment could become irrelevant. This is why supervision needs to happen earlier. It must be proactive. It must take place during development. And it must exist in the United States, where the frontier labs are based.


Some important questions remain. How will supervisors know what to look for? How will they know when to intervene, or when to call out dangerous behaviour inside a lab?[2]

But the core claim stands. The path to superintelligence carries serious risks, and supervision is one way of trying to contain them. When labs no longer have an incentive to release their models publicly, there still needs to be a watchdog.

  1. ^

    See Peter Wills’ paper for more on operationalising supervisors.

  2. ^

    An additional question is what supervision looks like once advanced systems begin doing AI R&D themselves. If developers are replaced by highly capable agents improving their own designs, what exactly are supervisors overseeing? How do you supervise the agents?



