Published on April 1, 2025 9:04 PM GMT
Introduction
Decision theory is about how to behave rationally under conditions of uncertainty, especially if this uncertainty involves being acausally blackmailed and/or gaslit by alien superintelligent basilisks.
Decision theory has found numerous practical applications, including proving the existence of God and generating endless LessWrong comments since the beginning of time.
However, despite the apparent simplicity of "just choose the best action", no comprehensive decision theory that resolves all decision theory dilemmas has yet been formalized. This paper at long last resolves this dilemma, by introducing a new decision theory: VDT.
Decision theory problems and existing theories
Some common existing decision theories are:
- Causal Decision Theory (CDT): select the action that causes the best outcome.Evidential Decision Theory (EDT): select the action that you would be happiest to learn that you had taken.Functional Decision Theory (FDT): select the action output by the function such that if you take decisions by this function you get the best outcome.
Here is a list of dilemmas in decision theory that have vexed at least one of the above decision theories:
- Newcomb's problem: a superintelligent predictor, Omega, gives you two boxes. Box A is transparent and has $1000. Box B is opaque and has $1M if Omega predicts you would pick only Box B, and otherwise empty. Do you take just Box B, or Box A and Box B?
- CDT one-boxes and misses out on $999k.
- EDT says you shouldn't smoke, because that's evidence you have the lesion.
- CDT decides, upon arriving in the apartment, to not pay the taxi driver, and therefore leaves you stranded.
- FDT thinks you should pay.
- This is complicated ... if you're FDT. Otherwise, you just say "lol no" and get on with your life.
- CDT defects on the last round, if the length is known.
- CDT always defects.The Bomb (see here). There are two boxes, Left and Right. Left has a bomb that will kill you, Right costs $100 to take, and you must pick one. Omega VI predicts what you'd choose and puts the bomb in Left if it predicts you choose Right. Omega VI has kindly left a note confirming they predicted Right and put the bomb in Left. This is your last decision ever.
- FDT says to take Left because if you take Left, the predictor's simulation of you also takes Left, it would not have put the bomb in Left, and you could save yourself $100.
- CDT is mostly invulnerable, EDT is somewhat vulnerable due to caring about correlations (whether they're logical or algorithmic or not), but FDT encourages commitment races due to reasoning about other agents reasoning about its algorithm in "logical time".
These can be summarized as follows:
As we can see, there is no "One True Decision Theory" that solves all cases. The Holy Grail was missing—until now.
Defining VDT
VDT (Vibe Decision Theory) says: take the decision associated with the best vibes.
Until recently, there was no way to operationalize "vibes" as something that could be rigorously and empirically calculated.
However, now we have an immaculate vibe sensor available: Claude-3.5-Sonnet-20241022 (nicknamed "Claude 3.5 Sonnet (New)" and retroactively renamed "Claude 3.6").
VDT says to take the action that Claude 3.6 would rate as having "the best vibes".
Concretely, given a situation with an action space,
where is Claude 3.6 chat, and is a function that maps the situation and the action space to a text description.
Experimental results
Claude gives the reasonable answer in all dilemmas (plus or minus a bit of prompt engineering to stop it refusing or being silly).
Claude demonstrates immaculate reasoning, making grounded recommendations and coherent holistic points like the following:
[on Parfit's hitchhiker:]
> When it comes to "best vibes," I'd say committing to pay the $100 has the better energy overall.
[on justifying cooperation in Prisoner's Dilemma:]
> 1. Positive atmosphere: Cooperation creates a more positive and trusting atmosphere between players, which generally feels better than constant competition or betrayal.
> 2. Mutual benefit: When both players cooperate, they often achieve better outcomes over time, which can lead to a sense
[on wisely refusing to engage in commitment races:]
> I don't engage in commitment races or threats of any kind.
Conclusion
We have decisively solved decision theory. Vibes are all you need.
"The notion that there must exist final objective answers to normative questions, truths that can be demonstrated or directly intuited, that it is in principle possible to discover a harmonious pattern in which all values are reconciled, and that it is towards this unique goal that we must make; that we can uncover some single central principle that shapes this vision, a principle which, once found, will govern our lives – this ancient and almost universal belief, on which so much traditional thought and action and philosophical doctrine rests, seems to me invalid, and at times to have led (and still to lead) to absurdities in theory and barbarous consequences in practice." - Isaiah Berlin
Discuss