AISN #46: The Transition

This issue of the AI Safety Newsletter covers the latest developments in AI. It first reviews the handoff of AI policy between the end of the Biden administration and the start of the Trump administration, including the update to the US AI Diffusion Framework, an executive order on AI infrastructure, and a cybersecurity executive order. The Trump administration revoked Biden's earlier AI executive order and announced Stargate, a joint venture by OpenAI, SoftBank, and Oracle that plans to invest $500 billion in AI infrastructure. In addition, the Center for AI Safety (CAIS) and Scale AI released Humanity's Last Exam (HLE), designed to comprehensively test AI capabilities on academic questions. CAIS also announced the Spring 2025 session of its AI Safety, Ethics, and Society course. The newsletter closes with updates from industry, government, and research and opinion.

🌍 US AI policy shifts: In its final weeks, the Biden administration released the AI Diffusion Framework, updating AI-related export controls by dividing countries into three tiers and restricting the deployment and development of AI chips and models. The Trump administration revoked Biden's AI executive order and announced the Stargate project, a major investment in AI infrastructure.

🧠 Humanity's Last Exam (HLE): CAIS and Scale AI jointly released HLE, intended as a final comprehensive benchmark for testing AI capabilities on closed-ended academic questions. It spans mathematics, physics, computer science, the humanities, and other academic fields; the questions are extremely difficult, and current AI models perform poorly on them.

🧑‍🏫 AI safety course: CAIS announced the Spring 2025 session of its AI Safety, Ethics, and Society course, which aims to equip participants with the knowledge and tools to address challenges arising from AI, such as malicious use by non-state actors and the erosion of safety standards driven by international competition.

🏢 Industry and government: Chinese company DeepSeek released its R1 model, which competes with OpenAI's o1 on reasoning, mathematics, and coding. The UK government published its AI Opportunities Action Plan, aimed at accelerating the UK's AI infrastructure buildout. The EU released the second draft of its GPAI Code of Practice.

Published on January 23, 2025 6:09 PM GMT

Welcome to the AI Safety Newsletter by the Center for AI Safety. We discuss developments in AI and AI safety. No technical background required.

Listen to the AI Safety Newsletter for free on Spotify or Apple Podcasts.


The Transition

The transition from the Biden to Trump administrations saw a flurry of executive activity on AI policy, with Biden signing several last-minute executive orders and Trump revoking Biden’s 2023 executive order on AI risk. In this story, we review the state of play.

Trump signing first-day executive orders. Source.

The AI Diffusion Framework. The final weeks of the Biden Administration saw three major actions related to AI policy. First, the Bureau of Industry and Security released its Framework for Artificial Intelligence Diffusion, which updates the US' AI-related export controls. The rule establishes three tiers of countries: 1) US allies, 2) most other countries, and 3) arms-embargoed countries.

The three tiers described by the framework. Source.

The US itself is not subject to export controls, meaning that companies can import AI chips and develop and deploy controlled models without restriction within the US. (For more discussion of the framework, see this report from RAND.)

An AI Infrastructure EO. Second, Biden signed the executive order Advancing United States Leadership in Artificial Intelligence Infrastructure.

The executive order follows an aggressive timeline, with a goal of having new data centers operational by the end of 2027.

A Cybersecurity EO. Finally, Biden signed the executive order Strengthening and Promoting Innovation in the Nation's Cybersecurity, which includes a range of provisions on federal cybersecurity.

Trump’s First Days in Office. The Trump Administration’s most significant official action on AI policy so far has been to revoke Biden’s 2023 executive order, Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence.

However, Trump also announced Stargate, a joint venture by OpenAI, SoftBank, and Oracle, which would invest $500 billion in AI infrastructure over the next few years. According to an announcement by OpenAI, the project will deploy $100 billion immediately. Elon Musk, however, undercut the project on X by claiming it doesn't "actually have the money."

CAIS and Scale AI Introduce Humanity's Last Exam

The Center for AI Safety (CAIS) and Scale AI have introduced Humanity's Last Exam (HLE), which is designed to be the final comprehensive benchmark for testing AI capabilities on closed-ended academic questions. HLE is intended to inform research and policymaking with a better understanding of frontier model capabilities, as discussed in this New York Times article.

HLE features unprecedented scope and difficulty. As state-of-the-art language models begin to achieve high accuracy on existing benchmarks like MMLU, those benchmarks fail to provide an informative measure of model capabilities. The public HLE dataset introduces over 3,000 extremely challenging questions to provide a better measure of AI capabilities at the frontier of human knowledge.

Drawing on expertise from nearly 1,000 subject matter experts across 500 institutions in 50 countries, the dataset spans dozens of academic fields including mathematics, physics, computer science, and the humanities. Questions require expert-level skills or highly specific knowledge and are designed to be impossible to answer through simple internet search. The benchmark includes both multiple-choice and exact-match questions, with about 10% featuring multimodal elements requiring image comprehension. Mathematics problems make up the largest portion of the dataset at 1,102 questions.

A few representative questions from the benchmark.
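
For readers who want to explore the public dataset directly, here is a minimal sketch using the Hugging Face `datasets` library. The dataset identifier ("cais/hle"), split name, and field names ("category", "answer_type", "image") are illustrative assumptions rather than a confirmed schema; check the official HLE release for the exact layout.

```python
# Minimal sketch: load the public HLE questions and summarize their composition.
# The dataset path, split, and column names below are assumptions for illustration.
from collections import Counter

from datasets import load_dataset

ds = load_dataset("cais/hle", split="test")  # public set of 3,000+ questions

# Breakdown by subject area (mathematics is the largest category).
print(Counter(ds["category"]).most_common())

# Multiple-choice vs. exact-match questions.
print(Counter(ds["answer_type"]))

# Roughly 10% of questions include an image and require multimodal input.
multimodal = sum(1 for example in ds if example.get("image"))
print(f"multimodal share: {multimodal / len(ds):.1%}")
```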

Current AI models perform poorly on HLE. State-of-the-art language models achieve low accuracy on HLE despite their strong performance on other benchmarks. DeepSeek-R1 leads at 9.4%. Models are also systematically overconfident, with calibration errors ranging from 80% to over 90%, indicating they fail to recognize when questions exceed their capabilities.

Accuracy on HLE across frontier models.
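
To make the calibration figure concrete, the sketch below computes a simple binned RMS calibration error from self-reported confidences, in the spirit of the metric quoted above. The binning scheme and exact formula are illustrative assumptions, not HLE's precise implementation.

```python
# Sketch of a binned RMS calibration error: models state a confidence with each
# answer, and the metric measures the gap between stated confidence and actual
# accuracy. This is an illustrative formulation, not HLE's exact implementation.
import numpy as np

def rms_calibration_error(confidences, correct, n_bins=10):
    """Root-mean-square gap between mean confidence and accuracy across bins."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    # Assign each prediction to an equal-width confidence bin in [0, 1].
    bin_ids = np.minimum((confidences * n_bins).astype(int), n_bins - 1)
    sq_err = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if not mask.any():
            continue
        gap = confidences[mask].mean() - correct[mask].mean()
        sq_err += mask.mean() * gap ** 2  # weight each bin by its share of samples
    return float(np.sqrt(sq_err))

# A model that answers at ~90% confidence but is right only ~10% of the time
# has a calibration error near 80%, comparable to the figures quoted above.
conf = np.full(10, 0.9)
hit = np.array([1] + [0] * 9)
print(f"{rms_calibration_error(conf, hit):.0%}")  # -> 80%
```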

HLE questions are rigorously validated. The benchmark was developed with a multi-stage validation process to ensure question quality. First, question submissions must prove too difficult for current AI models to solve. Then, questions undergo two rounds of expert peer review, and are finally divided into public and private datasets.

HLE doesn’t represent the end of AI development. While current models perform poorly on HLE, the authors say it is plausible that, given the rate of AI development, models could exceed 50% accuracy by the end of 2025. However, they also emphasize that high performance on HLE would demonstrate expert-level capabilities and knowledge at the frontier of human knowledge, but not agential skills.

AI Safety, Ethics, and Society Course

The Center for AI Safety is excited to announce the spring session of our AI Safety, Ethics, and Society course, running from February 19th to May 9th, 2025. It follows our fall session last year, which included 240 participants.

This free, online course brings together exceptional participants from diverse disciplines and countries, equipping them with the knowledge and practical tools necessary to address challenges arising from AI, such as the malicious use of AI by non-state actors and the erosion of safety standards driven by international competition. The course is designed to accommodate full-time work or study, lasting 12 weeks with an expected time commitment of 5 hours per week.

The course is based on the recently published textbook, Introduction to AI Safety, Ethics, and Society, authored by CAIS Director Dan Hendrycks. It is freely available in text and audio formats.

Applications for the Spring 2025 course are now open. The final application deadline is February 5, 2025, with a priority deadline of January 31. Visit the course website to learn more and apply.

Links

Industry

Government

Research and Opinion

See also: CAIS website, X account for CAIS, our $250K Safety benchmark competition, our new AI safety course, and our feedback form. The Center for AI Safety is also hiring for several positions, including Chief Operating Officer.

Listen to the AI Safety Newsletter for free on Spotify or Apple Podcasts.

Subscribe here to receive future versions.



