The ultimate goal

Published on July 5, 2025 7:10 PM GMT

My AI forecasting work aims to improve our understanding of the future so we can prepare for it and influence it in positive directions. Yet one problem remains: how do you turn foresight into action? I’m not sure, but I have some thoughts about learning the required skills.


Say you learn about existential risks from AI and consider redirecting your entire career to address these threats. Seeking career guidance, you find the 80,000 Hours website and encounter this page, which outlines two main approaches: technical AI safety research and AI governance/policy work.

You develop a career plan: "Educate yourself in governance, seek positions in policy advocacy organizations, and advocate for robust policies like whistleblower protections and transparency requirements for frontier AI labs in the US." It's a sensible, relatively robust plan that many concerned about AI pursue.

You work hard, and your efforts bear fruit! Better whistleblower protections are implemented. It’s a small, incremental improvement in governance—but that’s it. Other plans may culminate in marginal improvements in AI evaluation techniques, AI control, lab processes, public awareness, or international coordination.

Don’t get me wrong—marginal AI safety improvements are great! But these plans aren’t true AI safety solution paths. I struggle to find—or come up with—detailed plans that culminate in outcomes like the establishment of an international oversight body for frontier AI or the development of secure AI chips.

There are plenty of suggestions on what to aim for, yet I rarely encounter plans that are more sophisticated and concrete than “raise public awareness to pressure politicians into adopting our proposal” or “develop enabling technologies (e.g. alignment methods, or hardware assurance mechanisms) that may then hopefully be widely applied, perhaps enforced through policy.”

The vagueness is frustrating, and likely reduces success probabilities.

What Is a “Solution Path” Anyway?

By "solution path," I mean something those concerned about catastrophic AI risks can examine and think: "This looks like a reasonable, complete plan that could deliver outsized impact rather than marginal risk reduction." Something that explores the required steps and sub-goals, counters potential difficulties, and adapts to changing circumstances.

It doesn’t seem completely impossible to come up with such solution paths. Yet I don’t know of strategies for developing solutions to world-spanning problems that unfold across years and span technical, social, and political domains—seemingly insurmountable, or "impossible," problems.

How do you examine the underlying mechanisms of the world, far more complex than any mechanical clockwork, and figure out how to make improvements?

Learning Strategies

In the domain of math, you can learn good problem-solving strategies by tackling a few difficult problems. I have a very simple process that I usually follow—I summarize the problem specification and list the relevant theory. That’s it. Solutions are often obvious when you have all the required information in front of you, as long as you have a good grasp of the related theory. Complex problems require effort, but they're essentially reduced to puzzles, solved by combining the information in the right way.

Forecasting is less structured than solving math problems, and you don’t get to know your score until reality gives you an answer. Nevertheless, there are traps to avoid and principles to follow that you can learn relatively quickly—just read Superforecasting by Philip Tetlock and practice making predictions. After a while you start noticing when you are overconfident—especially in areas where you think you are knowledgeable. You may learn to recognize when you are underestimating task completion time (the planning fallacy) or overestimating the probability of an outcome you prefer (goal-oriented motivated reasoning).

While the exact steps vary more for forecasting than for solving math problems, there are key rules of thumb to follow. Start with reference classes and trend lines before considering the specific circumstances of whatever you are trying to predict. Consider all relevant information before making a prediction to avoid becoming prematurely attached to a probability estimate. When you make mistakes, think about what you can do better in the future instead of justifying your errors (good advice in general).
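
To make the "outside view first" habit concrete, here is a minimal sketch in Python. The question, tallies, and adjustment are entirely made up for illustration; the point is only the order of operations—anchor on a reference-class base rate, then adjust for case-specific evidence.

```python
# Illustrative only: the tallies and adjustment below are invented, not real data.

# Outside view first: how often did comparable past cases resolve "yes"?
reference_class = {"resolved_yes": 7, "total": 40}  # hypothetical tally of similar cases
base_rate = reference_class["resolved_yes"] / reference_class["total"]

# Inside view second: nudge the base rate for case-specific evidence,
# rather than starting from a gut number and anchoring on it.
adjustment = 0.05  # e.g., unusually strong recent momentum (assumed)
forecast = min(max(base_rate + adjustment, 0.0), 1.0)  # keep within [0, 1]

print(f"Base rate: {base_rate:.2f}, adjusted forecast: {forecast:.2f}")
```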

But how do you solve world-spanning, multi-year, seemingly insurmountable problems? Without years of trial and error, you need shorter feedback loops to develop the required skills. Forecasters have a clever trick for this—instead of waiting for the future to arrive and score your predictions, simply compare your forecasts to predictions that you trust. You can go to the forecasting platform Metaculus and compare your predictions to the Community Predictions, which are known to be fairly accurate. (Turn off “Show Community Prediction by default on open questions” in the Metaculus settings to ensure you don’t see the Community Prediction before making your own.)
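
As a rough sketch of that trick, the snippet below treats trusted community predictions as a stand-in for eventual outcomes and measures how far your own probabilities are from them. The question IDs and probabilities are hypothetical, and this is not a Metaculus API example—only the scoring idea.

```python
# Hypothetical forecasts; in practice you would record these yourself.
my_forecasts = {"q1": 0.80, "q2": 0.35, "q3": 0.60}
community_forecasts = {"q1": 0.65, "q2": 0.30, "q3": 0.85}  # trusted benchmark

def mean_squared_gap(mine: dict[str, float], trusted: dict[str, float]) -> float:
    """Average squared difference between your probabilities and the trusted ones.

    Zero means perfect agreement; larger values flag questions worth revisiting.
    This mirrors a Brier score, with the trusted forecast standing in for the outcome.
    """
    common = mine.keys() & trusted.keys()
    return sum((mine[q] - trusted[q]) ** 2 for q in common) / len(common)

print(f"Mean squared gap vs. community: {mean_squared_gap(my_forecasts, community_forecasts):.3f}")

# Per-question gaps show where your reasoning diverges most from the benchmark.
for q in sorted(my_forecasts):
    print(q, round(abs(my_forecasts[q] - community_forecasts[q]), 2))
```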

Is there an equivalent trick for learning to solve societal problems like catastrophic AI risk?

Clever Tricks for Learning Faster

First approach: Consult experienced practitioners. Just as professors intuit promising PhD proposals from supervision experience, people who've tackled insurmountable problems might guide newcomers. Such mentors are rarer than professors, though.

Second approach: Study executed plans. Analyze what worked and what failed in previous ambitious plans to shape the future. Lessons from strategic foresight—the discipline of deliberately shaping the future—may offer relevant wisdom. Case studies of successes and failures specific to AI safety may also prove valuable, examining for instance the rise and fall of the SB 1047 bill, or what led to the formation of the AI Safety Institutes.

Third approach: Start with rough plans and iterate. Create an initial vague—and probably stupid—plan and refine it over time. This is a backwards way of doing things: you initially don’t know exactly what you are solving. It’s much easier to start out with a concrete problem formulation with a constrained solution space. But it might work for fuzzy, complicated, open-ended problems like "AI safety". If each iteration moves the plan a little closer to something more complete and more likely to succeed, then each iteration also provides experience in strategic thinking—enabling the desired shorter feedback loop[1].

This third approach has the additional benefit of already being a strategy for solving complicated problems, not just a way of learning how to solve such problems.

Time for Reflection

One of my core motivations for doing forecasting—and writing this blog—is to better understand what can be done to ensure AI safety.

Initially, I believed that specific accurate predictions were the most strategically valuable output, so I focused on collecting forecasts from platforms with strong track records—as seen in my first four posts.

However, I increasingly recognized that good analyses matter as much as the predictions themselves—finding out the reasons some events are likely/unlikely, or why some things are especially hard to foresee. This led me to explore forecasting methodologies (like scenario forecasting) and tackle more complex subjects (like emergent alignment).

Yet even sophisticated analyses have limitations. Forecasting reaches its full potential only when used to explore possible futures—as demonstrated in the AI 2027 scenario. While often relying on probabilistic predictions, scenario work illuminates the underlying dynamics shaping what's to come. Thus, the Rogue Replication Timeline.

My post on Powerful Predictions (exploring forecasting’s applications in AI and AI safety), and the post you are reading now, place forecasting within a broader strategic context.

Though there are exceptions to the trend, there is a clear arc: examining predictions → understanding predictions → aggregating predictions into comprehensive future projections → exploring forecasting's broader strategic role[2].

Looking at this pattern and the ultimate goal of my forecasting work, a natural next step would be to integrate scenario forecasting and quantitative predictions into comprehensive solution paths for AI safety: starting with rough initial frameworks, then iterating while studying the strategic foresight literature and learning from experienced practitioners.

Since I can predict where my focus will eventually turn, why not start immediately?


Thank you for reading! If you found value in this post, consider subscribing!

  1. ^

    This resembles my general approach to understanding concepts or problems I find very difficult. I write down my thoughts about the issue and revisit it from time to time, perhaps adding some new knowledge or perspectives each time. I don’t claim to understand exactly how, but after enough revisits understanding often emerges. My brain finds ways to reason about it. Maybe this works for insurmountable problems too?

  2. ^

    I realize this arc may be more obvious to me than to others, as I have more inside knowledge of changes in my way of thinking and my priorities.


