AI safety content you could create

This article explores some key but overlooked gaps in AI safety that go beyond the traditional alignment problem. It points out that, besides alignment, there are other potentially catastrophic risks, such as the economic transition and the distribution of power. It also stresses the importance of learning from historical cases, such as nuclear weapons and the resource curse, in order to build more comprehensive AI safety plans. The author further emphasises the need for clear definitions of AI safety terminology and calls for more people to pay attention to these neglected problems, including economists, political scientists, and military strategists. The article reflects on plans for AI safety, points out the shortcomings of existing plans, and encourages the community to work together to improve them.

⚠️ AI safety problems are not limited to alignment: besides the AI alignment problem, there are other potentially catastrophic risks, such as the economic transition and the distribution of power, which equally require attention and solutions.

📚 Lessons from historical case studies: studying historical cases such as nuclear weapons and the resource curse can offer valuable experience and lessons for AI safety plans, helping us respond better to future challenges.

📝 Missing and incomplete AI safety plans: the article notes that there is currently no clear, comprehensive plan for AI safety, and calls on the community to work together to improve existing plans, including concrete action steps and ways to achieve the intended outcomes.

🔤 Clear definitions of AI safety terminology: the article stresses the importance of clearly defining AI safety terms so that the relevant problems can be better understood and discussed, and suggests creating easy-to-find resources that explain them.

Published on January 6, 2025 3:35 PM GMT

This is a (slightly chaotic and scrappy) list of gaps in the AI safety literature - content I think it would be useful and interesting for someone to create. I've broken it down into sections:

- Communication of catastrophic AI safety problems outside alignment
- Case studies for analogous problems
- Plans for AI safety
- Defining things

If you think there are articles that already cover the topics described, please verify that the articles you are thinking of do meet the criteria, and then tell me.

Communication of catastrophic AI safety problems outside alignment

I’ve previously written about how alignment is not all you need. And before me, others had written great things on parts of these problems. Friends have written up articles on parts of the economic transition, and specifically the intelligence curse.

Few people appear to be working on these problems, despite them seeming extremely important and neglected - and plausibly tractable? This might be because:

- there is little understanding of these problems in the community;
- the problems don't match the existing community's skills and experiences;
- few people have started, so there aren't obvious tractable in-roads to these problems; and
- there aren't organisations / structures for people to fit in to work on these problems.

The first two issues might be tackled by spreading these messages more clearly. The corresponding (semi-defined) audiences are:

- The existing community. Funders or decision makers at AI policy orgs might be particularly useful.
- People who would be useful to add to the community. These might be experts who could help by working on these problems, or at least by beginning to think about them (e.g. economists, politics and international relations scholars, military/field strategists). I suspect people from these fields will see obvious things we are missing.

There is downside risk here. We want to be particularly careful not to heat up race dynamics further, especially in messaging to the general public or to people likely to make race-y decisions. For this reason I'm more excited about spreading messages about the coordination problem and the economic transition problem than about the power distribution problem (see my problem naming for more context).

Case studies for analogous problems

Related to the plans discussed below, I think we could probably get a lot of insight into building better plans by looking at case studies from history.

Unfortunately, a lot of existing resources on case studies are:

I think people should be flexible about what they look into here (provided they expect it to have relevance to the key problems in AI safety). Some questions I brainstormed were:

Plans for AI safety

For the last few weeks, I've been working on trying to find plans for AI safety. They should cover the whole problem, including the major hurdles after intent alignment. Unfortunately, this has not gone well - my rough conclusion is that there aren't any very clear and well-publicised plans (or even very plausible stories) for making this go well. (More context on some of this work can be found in BlueDot Impact's AI safety strategist job posting.)

In short: what series of actions might get us to a state of existential security, ideally at a more granular level than 'pause' or 'regulate companies'?

Things that are kind of going in this direction:

However, many of them stop (or become very vague) after preventing misalignment, or don't describe how we will achieve the intended outcomes (e.g. bringing about a pause successfully). Additionally, while there has been criticism of some of the above plans, there is relatively little consensus-building around these plans, or further development from the community to improve them.

Building a better plan, or improving on one of these plans (not just criticising where it fails), would be really valuable.

Defining things

I run BlueDot Impact’s AI safety courses. This involves finding resources that explain what AI safety people are talking about.

There are useful concepts that people in AI safety take for granted, but there are ~no easy-to-find resources explaining them well. It'd be great to fix that.

I imagine these articles primarily as definitions. You might even want to create them as a LessWrong tag, AISafety.info article, or Arbital page (although I’m not certain Arbital is still maintained?). They could be a good match for a Wikipedia page, although I don’t know if they’re big enough for this.

I think these are slightly less important to write than the other ideas, but might be a good low-stakes entrypoint.


