My theory of change for working in AI healthtech

Published on October 12, 2024 12:36 AM GMT

This post starts out pretty gloomy but ends up with some points that I feel pretty positive about. Day to day, I'm more focussed on the positive points, but awareness of the negative has been crucial to forming my priorities, so I'm going to start with those. It's mostly addressed to the EA community, but is hopefully somewhat of interest to LessWrong and the Alignment Forum as well.

My main concerns

I think AGI is going to be developed soon, and quickly. Possibly (20%) that's next year, and most likely (80%) before the end of 2029. These are not things you need to believe for yourself in order to understand my view, so no worries if you're not personally convinced of this.

(For what it's worth, I did arrive at this view through years of study and research in AI, combined with over a decade of private forecasting practice starting in 2010, and I am pretty unlikely to change my mind about it. In particular, I'm not deferring to anyone else's opinion about this. That's because I feel sufficiently clear in my understanding of the various ways AGI could be developed from here, such that the disjunction of those possibilities adds up to a pretty high level of confidence in AGI coming soon, which is not much affected by who agrees with me about it.)

I also currently think there's around a 15% chance that humanity will survive through the development of artificial intelligence. In other words, I think there's around an 85% chance that we will not survive the transition. Many factors affect this probability, so please take this as a conditional forecast rather than something humanity is collectively unable to change.

First, I think there's around a 35% chance that humanity will lose control of one of the first few AGI systems we develop, in a manner that leads to our extinction. Most (80%) of this probability (i.e., 28%) lies between now and 2030. In other words, I think there's around a 28% chance that between now and 2030, certain AI developments will "seal our fate" in the sense of guaranteeing our extinction over a relatively short period of time thereafter, with all humans dead before 2040.

The main factor that I think could reduce this loss-of-control risk is government regulation that is flexible in allowing a broad range of AI applications while rigidly prohibiting uncontrolled intelligence explosions in the form of fully automated AI research and development.

This category of extinction event, involving a concrete loss-of-control event, is something I believe is no longer neglected within the EA community compared to when I first began focussing on it in 2010, and so it's not something I'm going to spend much time elaborating on.

What I think is neglected within EA is what happens to human industries after AGI is first developed, assuming we survive that transition.

Aside from the ~35% chance of extinction we face from the initial development of AGI, I believe we face an additional 50% chance that humanity will gradually cede control of the Earth to AGI after it's developed, in a manner that leads to our extinction through any number of effects including pollution, resource depletion, armed conflict, or all three. I think most (80%) of this probability (i.e., 40%) lies between 2030 and 2040, with the death of the last surviving humans occurring sometime between 2040 and 2050. This process would most likely involve a gradual automation of industries that are together sufficient to fully sustain a non-human economy, which in turn leads to the death of humanity.
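The arithmetic behind these forecasts can be checked in a few lines. This is only a sketch: the variable names are mine, and it assumes (as the wording "an additional 50%" suggests) that the two risks are meant as additive, unconditional probabilities.

```python
# Headline figures as stated in the post, written as probabilities.
# Variable names are illustrative, not the author's.
p_loss_of_control = 0.35     # extinction via acute loss of control of early AGI
p_dehumanization = 0.50      # additional extinction risk via gradual industrial dehumanization
share_in_main_window = 0.80  # "most (80%)" of each risk falls in its main time window

# Assumption: the two risks simply add, since the second is described as "additional".
p_extinction = p_loss_of_control + p_dehumanization
p_survival = 1.0 - p_extinction

# Portion of each risk that falls in its main window.
p_sealed_by_2030 = share_in_main_window * p_loss_of_control
p_sealed_2030_2040 = share_in_main_window * p_dehumanization

print(f"total extinction risk:             {p_extinction:.0%}")
print(f"chance of survival:                {p_survival:.0%}")
print(f"loss of control, now to 2030:      {p_sealed_by_2030:.0%}")
print(f"dehumanization, 2030 to 2040:      {p_sealed_2030_2040:.0%}")
```

The two risks sum to the 85% overall figure, leaving 15% for survival, and 80% of each risk works out to 28 and 40 percentage points respectively.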

Extinction by industrial dehumanization

This category of extinction process (which is multipolar, gradual, and effectively consensual for at least a small fraction of humans) is not something I believe the EA community is taking seriously enough. So I'm going to elaborate on it here. In broader generality, it's something I've written about previously with Stuart Russell in TASRA. I've also written about it on LessWrong, in "What Multipolar Failure Looks Like", with a diagram depicting the minimal set of industries needed to fully eliminate humans from the economy, both as producers and as consumers.

The main factor that I think could avoid this kind of industrial dehumanization is if humanity coordinates on a global scale to permanently prioritize the existence of industries that specifically serve humans and not machines (industries like healthcare, agriculture, education, and entertainment) and to prevent the hyper-competitive economic trends that AGI would otherwise unlock. Essentially, I'm aiming to achieve and sustain regulatory capture on the part of humanity as a special interest group relative to machines. Preserving industries that specifically care for humans means (a) maintaining vested commercial interests in policies that keep humans alive and well, and (b) ensuring that these industries extract adequate gains from the AI revolution over the next 5 years or so, thus radically increasing the collective capacity of the human species, enough to keep pace with machines so that we don't go "out with a whimper".

(Later in this post I'll elaborate on how I'm hoping we humans can better prioritize human-specific industries, and why I'm especially excited to work in health tech.)

The reason I expect human extinction to result from industrial dehumanization in a post-AGI economy is that I expect a significant and increasingly powerful fraction of humans to be okay with that. Like, I expect 1-10% of humans will gradually and willfully tolerate the dehumanization of the global economy, in a way that empowers that fraction of humanity throughout the dehumanization process until they themselves are also dead and replaced by AI systems.

Successionism as a driver of industrial dehumanization

For lack of a better term, I'll call the attitude underlying this process successionism, referring to the acceptance of machines as a successor species replacing humanity.

There are a variety of different attitudes that can lead to successionism. For instance:


Taken together, these various sources of successionism have a substantial potential to steer economic activities, both overtly and covertly. And, they can reinforce and/or cover for each other, in the formation of temporary alliances that advance or use AI in ways that risk or cause harm to humanity.

In particular, while the AI systems involved in an industrial dehumanization process may not be "aligned with humanity" in the sense of keeping us all around and happily in control of our destinies, the AI very well may be "aligned" in the sense of obeying successionist creators or users, who do not particularly care about humanity as a whole, and perhaps do not even prioritize their own survival very much.

One reason I'm currently anticipating this trend in the future is that I have met a surprising number of people who seem to feel okay with causing human extinction in the service of other goals. In particular I think more than 1% of AI developers feel this way, and I think maybe as high as 10% based on my personal experience from talking to hundreds of colleagues in the field, many of whom have graciously conveyed to me that they think humanity probably doesn't deserve to survive and should be replaced by AI.

The succession process would involve a major rebalancing of global industries, with a flourishing of what I call the machine economy, and a languishing of what I call the human economy. My cofounder Jaan Tallinn recently spoke about this at a United Nations gathering in New York.

The machine economy

The human economy

Economic rebalancing away from the human economy is not addressed by technical solutions to AI obedience, because of successionist humans who are roughly indifferent or even opposed to human survival.

So, while I'm glad to see people working hard on solving the obedience problem for AI systems — which helps to address much of the first category of risk involving an acute loss-of-control with the initial advent of AGI over the next few years — I remain dismayed at humanity's sustained lack of attention on how we humans can or should manage the global economy with AGI systems after they're sufficiently obedient to perform all aspects of human labor upon request.

My theory of change: confronting successionism with human-specific industries

Numerous approaches make sense to me for avoiding successionism, and arguably these are all necessary or at least helpful in avoiding successionist extinction pathways:

1. Social movements that celebrate and appreciate humanity, such as by spreading positive vibes that help people to enjoy their existence and delight in the flourishing of other humans.
2. Government policies that require human involvement in industrial activities, such as for accountability purposes.
3. Business trends that invigorate the human economy, especially healthcare, agriculture, education, entertainment, and environmental restoration.

These approaches can support each other. For example, successful businesses in (3) will have a natural motivation to advocate for regulations supporting (2) and social events fostering (1).

Currently, I think the EA movement is heavily fixated on government and technical efforts, to the point of neglecting pro-social and pro-business interventions that might even be necessary for resourceful engagement with government and tech development. In other words, EA is neglecting industrial solutions to the industrial problem of successionism.

As an example, consider the impact that AI policy efforts were having prior to ChatGPT-4, versus after. The impact of ChatGPT-4 being shipped as a product that anyone could use and benefit from vastly outstripped the combined efforts of everyone writing arguments and reports to increase awareness of AGI development in AI policy. That's because direct personal experience with something is so much more convincing than a logical or empirical argument, for most people, and it also creates logical common knowledge which is important for coordination.

Partly due to the EA community's (relative) disinterest in developing prosocial products and businesses in comparison to charities and government policies, I've not engaged much with the EA community over the past 6 years or so, even though I share certain values with many people in the community, including philanthropy.

However, I've recently been seeing more appreciation for "softer" (non-technical, non-governmental) considerations in AI risk coming from EA-adjacent readers, including some positive responses to a post I wrote called "Safety isn't safety without a social model". So, I thought it might make sense to try sharing more about how I wish the EA movement had a more diverse portfolio of approaches to AI risk, including industrial and social approaches.

For instance, amongst the many young people who have been inspired by EA to improve the world, I would love to see more people:

Taking pride in the generation of products and services through feedback loops that benefit everyone affected by the loop.

Founding more for-profit businesses that are committed to growing by helping people.

Note: This does not include for-profits that grow by hurting people, such as by turning people against each other and extracting profits from the conflict. Illegal arms dealers and social media companies do this. It's much better to make the good kind of for-profits that grow by helping people. I want more of those!

Hosting events that celebrate humanity, that leave people feeling happy to be alive and delighting in the happiness of others, especially kind-hearted and reasonable people who for whatever reason do not want to identify as EA or devote their whole career to EA.

Note: I've been pleased that certain EA-adjacent events I've attended over the past couple of years seem to have more of a positive vibe in this way, compared to my sense of the 2018-2022 era, which is another reason I feel more optimistic sharing this wish-list for cultural shifts that I would like to see from EA.

I suspect there can be massive flow-through effects from positive trends like these, that could help develop a healthy attitude for humanity choosing to continue its own existence and avoiding full-on successionism.

Also, the more we humans can make the world better right now, the more we can alleviate what might otherwise be a desperate dependency upon superintelligence to solve all of our problems. I think a huge amount of good can be done with the current generation of AI models, and the more we achieve that, the less compelling it will be to take unnecessary risks with rapidly advancing superintelligence. There's a flinch reaction people sometimes have against this idea, because it "feeds" the AI industry by instantiating or acknowledging more of its benefits. But I think that's too harsh of a boundary to draw between humanity and AI, and I think we (humans) will do better by taking a measured and opportunistic approach to the benefits of AI.

How I identified healthcare as the industry most relevant to caring for humans

For one thing, it's right there in the name :)

More systematically:

Healthcare, agriculture, food science, education, entertainment, and environmental restoration are all important industries that serve humans but not machines. These are industries I want to sustain and advance, in order to keep the economy caring for humans, and to avoid successionism and industrial dehumanization. Also, good business ideas that grow by helping people can often pay for themselves, and thus help diversify funding sources for doing more good.

So, first and foremost, if you see ideas for businesses that meaningfully contribute to any of those industries, please build them! At the Survival and Flourishing Fund we now make non-dilutive grants to for-profits (in exchange for zero equity), and I would love for us to find more good business ideas to support.

With that said, healthcare is my favorite human-specific industry to advance, for several reasons:

QALYs (quality-adjusted life years)

Operationalizing "alignment"

Geopolitical factors

Technical depth

My company, HealthcareAgents.com

It's okay with me if only some of the above bets pay out, as long as my colleagues and I can make a real contribution to healthcare with AI technology, and help contribute to positive attitudes and business trends that avoid successionism and industrial dehumanization in the era of AGI.

But why not just do safety work with big AI labs or governments?

You might be wondering why I'm not working full-time with big AI labs and governments to address AI risk, given that I think loss-of-control risk is around 35% likely to get us all killed, and that it's closer in time than industrial dehumanization.

First of all, this question arguably ignores most of the human economy aside from governments and AGI labs, which should be a bit of a red flag I think, even if it's a reasonable question for addressing near-term loss-of-control risk specifically.

Second, I do still spend around 1 or 1.5 workdays per week addressing the control problem, through spurts of writing, advocacy and philanthropic support for the cause, in my work for UC Berkeley and volunteering for the Survival and Flourishing Fund. That said, it's true that I am not focusing the majority of my time on addressing the nearest term sources of AI risk.

Third, a major reason for my focus on longer-term risks on the scale of 5+ years — after I'm pretty confident that AGI will already be developed — is that I feel I've been relatively successful at anticipating tech development over the past 10 years or so, and the challenges those developments would bring. So, I feel I should continue looking 5 years ahead and addressing what I'm fairly sure is coming on that timescale.

For context, I first started working to address the AI control problem in 2012, by attempting to build and finance a community of awareness about it, and later through research at MIRI in 2015 and 2016. Around that time, I concluded that multipolar AI risks would be even more neglected than unipolar risks because they are harder to operationalize. I began looking for ways to address multipolar risks, first through research in open-source game theory, then within video game environments tailored to include caretaking relationships, and now in the real-world economy with healthcare as a focus area. And sadly it took me most of the period from 2012 to 2021 to realize that I should be working on for-profit feedback loops for effecting industrial change at a global scale, through the development of helpful products and services that can keep a growing business oriented on doing good work that helps people.

Now, in 2024, the loss-of-control problem is much more imminent but also much less neglected than when I started worrying about it, so I'm even more concerned with positioning myself and my business to address problems that might not become obvious for another 5-10 years. The potential elimination of the healthcare industry in the 2030s is one of those problems, and I want to be part of the solution to it.

Fourth, even if we (humans) fail to save the whole world, I will still find it intrinsically rewarding to help a bunch of people with their health problems between now and then. In other words, I also care about healthcare in and of itself, even if humanity might somehow destroy itself soon. This caring allows me to focus myself and my team on something positive that's enjoyable to scale up and that grows by helping people, which I consider a healthy attribute for a growing business.

Fifth and finally, overall I would like to see more ambitious young people who want to improve the world with helpful feedback loops that scale into successful businesses, because industry is a lot of what drives the world, and I want morally driven people to be driving industry.

Conclusion

In summary,

support the well-being of present-day humans,
spread positive vibes, and
leave people valuing their own existence and delighting in the happiness of others, especially in ways that help to avoid all-out successionism with AI.

I think it's highly tractable as an AI application area,
caring for the health of present-day people is intrinsically rewarding for myself and my team, and
healthcare is a great setting for operationalizing and addressing practical AI alignment problems at various scales of organization simultaneously.

Thanks for reading about why I'm working in healthtech :)
