Published on April 12, 2025 2:24 PM GMT
Epistemic status: Noticing confusion
There is little discussion on LessWrong about AI governance and outreach. Meanwhile, these efforts could buy us time to figure out technical alignment. And even if we figure out technical alignment, we still have to solve crucial governance challenges so that totalitarian lock-in or gradual disempowerment doesn't become the default outcome of deploying aligned AGI.
Here are three reasons why we think we might want to shift far more resources towards governance and outreach:
1. MIRI's shift in strategy
The Machine Intelligence Research Institute (MIRI), traditionally focused on technical alignment research, has pivoted to broader outreach. They write in their 2024 end-of-year update:
Although we continue to support some AI alignment research efforts, we now believe that absent an international government effort to suspend frontier AI research, an extinction-level catastrophe is extremely likely.
As a consequence, our new focus is on informing policymakers and the general public about the dire situation, and attempting to mobilize a response.
Eliezer Yudkowsky already argued in 2023 that "Pausing AI Developments Isn't Enough. We Need to Shut it All Down".
2. Even if we solve technical alignment, Gradual Disempowerment seems to make catastrophe the default outcome
The "Gradual Disempowerment" report warns that even steady, non-hostile advancements in AI could gradually erode human influence over key societal systems like the economy, culture, and governance. As AI systems become more efficient and cost-effective, they may increasingly replace human roles, leading institutions to prioritize AI-driven processes over human participation. This transition could weaken both explicit control mechanisms, like democratic participation, and implicit alignments that have historically ensured societal systems cater to human interests. The authors argue that this subtle shift, driven by local incentives and mutual reinforcement across sectors, could lead to an irreversible loss of human agency, constituting an existential risk. They advocate for proactive measures, including developing metrics to monitor human influence, implementing regulations to limit AI's autonomy in critical areas, and fostering international cooperation to ensure that AI integration doesn't compromise human agency.
This concern is not new; it builds on worries Paul Christiano voiced as early as 2019 in "What failure looks like".
3. We have evidence that the governance naysayers are badly calibrated
In a recent interview, Conjecture's Gabriel Alfour reports that Control AI simply cold-emailed British MPs and Lords, got 60 meetings, and had 20 of them sign a statement to take extinction risks from AI seriously. That's a 33% conversion rate, achieved without any existing network in politics to draw on.
Meanwhile, the implicit consensus in AI safety circles appears to be that normies are basically crazy and not worth talking to unless you have secret-service-grade training in persuasion and are already deeply involved in your country's political backrooms.
Conclusion
Given these considerations, we find it surprising that LessWrong still barely addresses governance and outreach. We wonder whether it makes sense to devote far more resources to governance and outreach than is currently the case.
Or is as much effort going into governance as into technical alignment, and we simply haven't found the platforms where the relevant conversations happen?