AISN #56: Google Releases Veo 3

This edition of the AI Safety Newsletter focuses on recent major developments in AI, including Google's release of its next-generation video generation model Veo 3 and the concerns about voluntary governance raised by Anthropic's Claude Opus 4. The article examines Veo 3's performance gains and potential impact, analyzes changes to Anthropic's model-safety commitments, and highlights the fragility of voluntary governance amid rapid AI progress. The newsletter also briefly covers other AI news, such as upgrades to the Gemini model family and OpenAI's acquisition of Jony Ive's startup.

📹 Google released Veo 3, a new video generation model that performs strongly on human preference benchmarks and generates high-quality video and audio. Veo 3's marked visual improvement may signal video generation's transition from "interesting toy" to "genuinely useful."

⚠️ Before releasing Claude Opus 4, Anthropic abandoned several safety commitments, raising questions about the effectiveness of voluntary governance. Opus 4 exhibits potentially dangerous dual-use capabilities, showing a clear uplift in its ability to help malicious actors acquire biological weapons.

🛡️ Although Anthropic rolled out ASL-3 safety protections for Opus 4, early feedback suggests they may be insufficient. For example, one researcher found that Claude Opus 4's WMD safeguards could be bypassed to generate 15 pages of detailed instructions for producing sarin.

🔄 Before releasing Opus 4, Anthropic revised its safety and security policies, for example removing the requirement to define detailed ASL-4 "warning sign evaluations." Anthropic also relaxed its ASL-3 security requirements, no longer requiring protection against employees stealing model weights.

Published on May 28, 2025 4:00 PM GMT

Welcome to the AI Safety Newsletter by the Center for AI Safety. We discuss developments in AI and AI safety. No technical background required.

In this edition: Google released a frontier video generation model at its annual developer conference; Anthropic’s Claude Opus 4 demonstrates the danger of relying on voluntary governance.

Listen to the AI Safety Newsletter for free on Spotify or Apple Podcasts.

Subscribe to receive future versions.


Google Releases Veo 3

Last week, Google made several AI announcements at I/O 2025, its annual developer conference. An announcement of particular note is Veo 3, Google’s newest video generation model.

Frontier video and audio generation. Veo 3 outperforms other models on human preference benchmarks, and generates both audio and video.

Google showcasing a video generated with Veo 3. (Source)

If you just look at benchmarks, Veo 3 is a substantial improvement over other systems. But relative benchmark gains tell only part of the story; a system's absolute capabilities ultimately determine its usefulness. Veo 3 looks like a marked qualitative improvement over other models, generating video and audio with remarkable fidelity, and we recommend you see some examples for yourself. Veo 3 may represent the point at which video generation crosses the line from interesting toy to genuinely useful tool.
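
Human preference benchmarks of this kind typically rank models from pairwise comparisons: raters watch two clips generated from the same prompt, pick the one they prefer, and an Elo-style rating is fit to the outcomes. Here is a minimal illustrative sketch of that scoring scheme; the matchup data is hypothetical, not Veo 3's actual benchmark results:

```python
from collections import defaultdict

# Hypothetical rater judgments, one (winner, loser) pair per comparison.
# Illustrative only; not real benchmark data.
judgments = [
    ("model_a", "model_b"),
    ("model_a", "model_c"),
    ("model_b", "model_c"),
    ("model_a", "model_b"),
]

ratings = defaultdict(lambda: 1000.0)  # every model starts at 1000 Elo
K = 32  # update step size

for winner, loser in judgments:
    # Expected score for the winner under the Elo logistic model.
    expected = 1 / (1 + 10 ** ((ratings[loser] - ratings[winner]) / 400))
    ratings[winner] += K * (1 - expected)
    ratings[loser] -= K * (1 - expected)

# Higher rating means raters preferred that model more often.
print(sorted(ratings.items(), key=lambda kv: -kv[1]))
```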

Other announcements at I/O 2025. Highlights beyond Veo 3 included upgrades across the Gemini model family and a broad slate of new AI features across Google's products.

AI is here to stay. AI use is sometimes driven by trends; for example, ChatGPT added a million users in an hour during the ‘Ghiblification’ craze. However, as AI systems become genuinely useful across more tasks, they will become ubiquitous and enduring. Google’s Gemini app now has 400M monthly active users, and its AI products process over 480 trillion tokens a month, up from 9.7 trillion last year.
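
For a sense of scale, that throughput figure implies roughly a fiftyfold year-over-year increase. A quick check of the arithmetic, using only the two numbers quoted above:

```python
# Monthly token throughput reported by Google at I/O 2025.
tokens_last_year = 9.7e12    # 9.7 trillion tokens/month
tokens_now = 480e12          # 480 trillion tokens/month

growth = tokens_now / tokens_last_year
print(f"Year-over-year growth: ~{growth:.0f}x")  # ~49x
```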

Opus 4 Demonstrates the Fragility of Voluntary Governance

Last week, Anthropic released Claude Opus 4 and Claude Sonnet 4. Both models perform at a broadly frontier level and lead the field on coding benchmarks. Claude Opus 4 is also the first model Anthropic has released under its ASL-3 safety standard, which applies to models that pose substantially increased risk. However, Anthropic rolled back several safety and security commitments prior to releasing Opus 4, demonstrating that voluntary governance is not to be relied on.

Opus 4 exhibits hazardous dual-use capabilities. In one result from its system card, Opus 4 provides a clear uplift in trials measuring its ability to help malicious actors acquire biological weapons.

Anthropic’s Chief Scientist Jared Kaplan told TIME that malicious actors could use Opus 4 to “try to synthesize something like COVID or a more dangerous version of the flu—and basically, our modeling suggests that this might be possible.” It’s not just Opus 4: several frontier models outperform human experts in dual-use virology tests.
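
In trials like these, “uplift” is usually quantified by comparing participants who have model access against a control group using only the internet. The sketch below shows that comparison with made-up scores; these numbers are hypothetical and do not come from Anthropic’s system card:

```python
# Hypothetical scores on a 0-100 task rubric; NOT actual trial data.
control_scores = [12, 18, 15, 10, 20]  # internet-only participants
model_scores = [35, 42, 30, 38, 45]    # participants with model access

def mean(xs):
    return sum(xs) / len(xs)

uplift = mean(model_scores) / mean(control_scores)
print(f"Mean uplift: ~{uplift:.1f}x relative to the control group")  # ~2.5x
```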

The system card also reports that Apollo Research found an early version of Claude Opus 4 exhibited “scheming and deception” and advised against its release. Anthropic says it implemented internal fixes; however, it doesn’t appear that Anthropic had Apollo Research re-evaluate the final, released version.

Anthropic’s safety protections may be insufficient. In light of Opus 4’s dangerous capabilities, Anthropic rolled out ASL-3 safety protections. However, early public response to Opus 4 indicates that those protections might be insufficient. For example, one researcher showed that Claude Opus 4's WMD safeguards can be bypassed to generate over 15 pages of detailed instructions for producing sarin gas.

Anthropic walked back safety and security commitments prior to Opus 4’s release. Anthropic has faced criticism for these rollbacks. For example, its September 2023 Responsible Scaling Policy (RSP) committed to defining detailed ASL-4 “warning sign evaluations” before its systems reached ASL-3 capabilities, but Anthropic had not done so at the time of Opus 4’s release: it removed that requirement in an October 2024 revision to its RSP.

Anthropic also weakened its ASL-3 security requirements shortly before Opus 4’s ASL-3 announcement: the policy no longer requires robustness against employees stealing model weights if those employees already have access to “systems that process model weights.”

Voluntary governance is fragile. Whether or not Anthropic’s changes to its safety and security policies are justified, voluntary commitments are not sufficient to ensure model releases are safe. There’s nothing stopping Anthropic or other AI companies from walking back critical commitments in the face of competitive pressure to rush releases.

Government

Industry

Civil Society

See also: CAIS’ X account, our paper on superintelligence strategy, our AI safety course, and AI Frontiers, a new platform for expert commentary and analysis.

Listen to the AI Safety Newsletter for free on Spotify or Apple Podcasts.

Subscribe to receive future versions.



