Announcing a $500 bounty for work that meaningfully engages with the idea of asymmetric existential AI risk.
Background
Existential risk has been defined by the rationalist/Effective Altruist sphere as risk to the continued existence of the human species, under the premise that the continuation of the species has very high value. This provided a strong rationality (or effectiveness) grounding for large investments in AI alignment research at a time when the risks still seemed remote and obscure to most people. However, as an apparent side effect, "AI risk" and "risk of a misaligned AI destroying humanity" have become nearly conflated.
Over the past couple of years I have attempted to draw attention to highly asymmetric AI risks, in which a small number of controllers of AI that is "aligned" (from their point of view) employ it to kill the rest of the human population. From the point of view of the average person, who would stand to be killed along with their children and approximately everyone they personally know, this ought to count meaningfully as existential risk. Arguably, by logic similar to that used to justify early alignment research, even a low probability of such an outcome is bad enough to justify investment in its prevention. Furthermore, prevention by way of arresting AI development conveniently provides a two-for-one solution, since it also addresses the misalignment problem. Conversely, investing in successful AI "alignment" without evaluating the full destructive potential of aligned AI potentially makes the investor complicit in genocide. These points suggest that members of the rationalist/Effective Altruist sphere (at least, based on my understanding of their stated commitments) should take a strong interest in asymmetric existential AI risk. But so far my efforts have revealed no evidence of such interest.
This bounty is an attempt to stimulate engagement through small monetary reward(s). More concretely, the goal is to shift the status of this risk broadly from "unacknowledged" (which could mean "possible but highly psychologically inconvenient") to "examined and assigned objective weight," even if the weight turns out to be very low.
Existing Work
My latest post on this topic, linking to a longform essay and the previous post
A 1999 book I was recently made aware of (with a focus on nanotechnology rather than AI)
Terms
I will keep this bounty open for two weeks, through June 24th, 2025, or until I feel the full amount can be fairly distributed, whichever comes first. If you are willing to contribute without compensation, that would also be highly appreciated.
Any good-faith and meaningful engagement with the topic, at the object level or the meta level, including efforts to promote further engagement or to rebut my assertions about its neglected status, is eligible for a portion of the bounty. Tasteful cross-posting counts. Add a comment here, DM me on LessWrong, or use one of the contact methods listed at https://populectomy.ai with an unambiguous request to be rewarded.