Anthropic Newsroom · July 8, 02:24
The Need for Transparency in Frontier AI

The article stresses that as artificial intelligence (AI) advances rapidly, ensuring its safe, responsible, and transparent development is essential. It proposes a transparency framework aimed at developers of the largest AI systems, designed to establish clear disclosure requirements that safeguard safety practices. The framework deliberately avoids excessive restriction, encouraging innovation while promoting beneficial applications of AI. The article recommends limiting the framework's scope to the largest developers, creating Secure Development Frameworks, making those frameworks public, publishing system cards, and protecting whistleblowers by prohibiting false statements. It argues that this transparency approach sheds light on industry best practices and helps set a baseline for how responsible labs train their models.

🛡️ Apply the AI transparency framework only to the largest developers building the most capable models, distinguished by high computing power, high computing cost, strong evaluation performance, and substantial annual revenue and R&D investment.

⚙️ Require covered developers to create a Secure Development Framework describing how they assess and mitigate unreasonable risks in a model, including chemical, biological, radiological, and nuclear threats, as well as harms caused by misaligned model autonomy.

📢 Make the Secure Development Framework public by disclosing it on a public-facing website maintained by the AI company, so that researchers, governments, and the public can stay informed about currently deployed AI models, accompanied by a self-certification that the lab is complying with the terms of its published framework.

📄 Publish a system card or other documentation summarizing testing and evaluation procedures, results, and required mitigations, disclosed publicly at deployment and updated when the model is substantially revised, with appropriate redactions to protect public safety and the security of the model.

⚖️ Protect whistleblowers by making it explicitly unlawful for a lab to lie about its compliance with its framework, which allows existing whistleblower protections to apply and keeps enforcement resources focused on labs engaged in purposeful misconduct.

Frontier AI development needs greater transparency to ensure public safety and accountability for the companies developing this powerful technology. AI is advancing rapidly. While industry, governments, academia, and others work to develop agreed-upon safety standards and comprehensive evaluation methods—a process that could take months to years—we need interim steps to ensure that very powerful AI is developed securely, responsibly, and transparently.

We are therefore proposing a targeted transparency framework, one that could be applied at the federal, state, or international level, and which applies only to the largest AI systems and developers while establishing clear disclosure requirements for safety practices.

Our approach deliberately avoids being heavily prescriptive. We recognize that as the science of AI continues to evolve, any regulatory effort must remain lightweight and flexible. It should not impede AI innovation, nor should it slow our ability to realize AI's benefits—including lifesaving drug discovery, swift delivery of public benefits, and critical national security functions. Rigid government-imposed standards would be especially counterproductive given that evaluation methods become outdated within months due to the pace of technological change.

Minimum Standards for AI Transparency

Below are the core tenets we believe should guide AI transparency policy:

- Limit Application to the Largest Model Developers: AI transparency requirements should apply only to the largest frontier model developers building the most capable models, where frontier models are distinguished by a combination of thresholds for computing power, computing cost, evaluation performance, annual revenue, and R&D. To avoid burdening the startup ecosystem and small developers whose models pose low risk to national security or of causing catastrophic harm, the framework should include appropriate exemptions for smaller developers. We welcome input from the startup community on what those thresholds should be. Internally, we have discussed examples of what the threshold could look like: annual revenue cutoffs on the order of $100 million, or R&D or capital expenditures on the order of $1 billion annually. These scoping thresholds should be reviewed periodically as the technology and industry landscape evolve (a minimal sketch of how such thresholds might combine appears after this list).

- Create a Secure Development Framework: Require covered frontier model developers to have a Secure Development Framework that lays out how they will assess and mitigate unreasonable risks in a model. Those risks must include the creation of chemical, biological, radiological, and nuclear harms, as well as harms caused by misaligned model autonomy. Secure Development Frameworks are still an evolving safety tool, so any proposal should strive for flexibility.

- Make the Secure Development Framework Public: The Secure Development Framework should be disclosed to the public, subject to reasonable redactions for sensitive information, on a public-facing website registered to and maintained by the AI company. This will enable researchers, governments, and the public to stay informed about the AI models deployed today. The disclosure should come with a self-certification that the lab is complying with the terms of its published Secure Development Framework.

- Publish a System Card: System cards or other documentation should summarize the testing and evaluation procedures, results, and required mitigations, subject to appropriate redaction of information that could compromise public safety or the safety and security of the model. The system card should be publicly disclosed at deployment and updated whenever the model is substantially revised.

- Protect Whistleblowers by Prohibiting False Statements: Explicitly make it a violation of law for a lab to lie about its compliance with its framework. This creates a clear legal violation, enables existing whistleblower protections to apply, and ensures that enforcement resources are focused squarely on labs that have engaged in purposeful misconduct.

- Transparency Standards: A workable AI transparency framework should define a minimum set of standards so that it can enhance security and public safety while accommodating the evolving nature of AI development. Because AI safety and security practices remain in their early stages, with frontier developers like Anthropic actively researching best practices, any framework must be designed to evolve. Standards should begin as flexible, lightweight requirements that can adapt as consensus best practices emerge among industry, government, and other stakeholders.
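To make the scoping idea more concrete, here is a minimal sketch of how the example thresholds above might be encoded, assuming the hypothetical figures mentioned in the first item (annual revenue on the order of $100 million, R&D or capital expenditures on the order of $1 billion). The field names, the compute-threshold flag, and the way the criteria are combined are illustrative assumptions, not part of the proposal.

```python
from dataclasses import dataclass

@dataclass
class DeveloperProfile:
    annual_revenue_usd: float        # trailing-year revenue
    annual_rd_capex_usd: float       # annual R&D or capital expenditure
    exceeds_capability_bar: bool     # whether any model crosses the (unspecified here)
                                     # compute, cost, or evaluation-performance thresholds

# Example cutoffs floated in this post; a real framework would set and
# periodically revise these as the technology and industry landscape evolve.
REVENUE_CUTOFF_USD = 100_000_000      # "on the order of $100 million"
RD_CAPEX_CUTOFF_USD = 1_000_000_000   # "on the order of $1 billion annually"

def is_covered_frontier_developer(dev: DeveloperProfile) -> bool:
    """Return True if a developer would fall inside the framework's scope.

    Developers below every bar are exempt, keeping the burden off the startup
    ecosystem and small developers whose models pose low catastrophic risk.
    """
    meets_financial_bar = (
        dev.annual_revenue_usd >= REVENUE_CUTOFF_USD
        or dev.annual_rd_capex_usd >= RD_CAPEX_CUTOFF_USD
    )
    return meets_financial_bar and dev.exceeds_capability_bar
```

Whether the financial and capability criteria should combine conjunctively, as in this sketch, or in some other way is a policy choice the proposal leaves open to input from stakeholders.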

This transparency approach sheds light on industry best practices for safety and can help set a baseline for how responsible labs train their models, ensuring developers meet basic accountability standards while enabling the public and policymakers to distinguish between responsible and irresponsible practices. For example, the Secure Development Framework we describe here is akin to Anthropic's own Responsible Scaling Policy and similar policies from other leading labs (Google DeepMind, OpenAI, Microsoft), all of which have already implemented comparable approaches while releasing frontier models. Putting a Secure Development Framework transparency requirement into law would not only standardize industry best practices without setting them in stone but would also ensure that the disclosures (which are now voluntary) could not be withdrawn in the future as models become more powerful.

Views differ on whether and when AI models could pose catastrophic risks. Transparency requirements for Secure Development Frameworks and system cards could help give policymakers the evidence they need to determine if further regulation is warranted, as well as provide the public with important information about this powerful new technology.

As models advance, we have an unprecedented opportunity to accelerate scientific discovery, healthcare, and economic growth. Without safe and responsible development, a single catastrophic failure could halt progress for decades. Our proposed transparency framework offers a practical first step: public visibility into safety practices while preserving private sector agility to deliver AI's transformative potential.
