Published on July 23, 2025 3:45 PM GMT
This post assumes knowledge of Newcomb's problem. The background below is provided for reference and can be skipped if you already know the problem.
Background on Newcomb's Problem
Newcomb's problem is a game in which you face two options: take only an opaque box (which contains either zero dollars or one million dollars), or also take a transparent box that visibly contains a thousand dollars.
At first glance, it seems obvious that you should take both boxes in order to get the extra thousand dollars.
The catch is that, before you even knew you were about to play this game, someone made a prediction about which decision you would make, and the contents of the opaque box are based entirely on that prediction. If they predicted you would take just the opaque box, it holds a million dollars. Otherwise, it holds zero dollars.
Additionally, this predictor is known to be very good at predicting people's decisions in this game, and you have witnessed that they have never made a mistake, even after hundreds of fair trials - and you know there was no cheating or foul play involved.
Now that you have been given all this information about how the game works, you are informed that the prediction for your decision has already been made before you even knew that this game existed. Accordingly, the contents of the box have been set and cannot be changed. You are now faced with the decision to take just the opaque box, or take both the opaque box and the transparent box.
After hearing this problem, most people immediately gravitate to one answer or the other and feel completely confident in it. The trouble is that the problem seems to divide people evenly on what the best thing to do is.
On one hand, it is a fact that one-boxers have historically all become millionaires after one-boxing, whereas two-boxers have had nothing to show for their rationality other than a sad thousand dollars. And this isn't just historical coincidence, either: given that the predictor is highly accurate, the expected value of one-boxing is much higher than the expected value of two-boxing.
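To make the expected-value comparison concrete, here is a quick sketch (the 99% accuracy figure is an illustrative assumption; the problem only says the predictor is highly accurate):

```python
# Expected payouts when the predictor is right with probability p.
# p = 0.99 is an illustrative assumption, not part of the problem.
p = 0.99

# One-boxing pays $1,000,000 exactly when the prediction was right.
ev_one_box = p * 1_000_000

# Two-boxing finds the opaque box full only when the prediction was
# wrong, and always collects the visible $1,000.
ev_two_box = (1 - p) * 1_000_000 + 1_000

assert ev_one_box > ev_two_box  # one-boxing wins in expectation
```

Even at a far more modest 60% accuracy, one-boxing still comes out ahead in expectation ($600,000 versus $401,000).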
And yet, on the other hand, it is also a fact that, by the time you are presented with this choice, no one can change the contents of the opaque box anymore; therefore, for any fixed prediction, two-boxing yields an extra thousand dollars relative to one-boxing.
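The two-boxer's dominance reasoning can also be checked mechanically; a minimal sketch:

```python
# Dominance: for either possible (and now unchangeable) content of
# the opaque box, two-boxing pays exactly $1,000 more than one-boxing.
for opaque_contents in (0, 1_000_000):
    one_box_payout = opaque_contents
    two_box_payout = opaque_contents + 1_000
    assert two_box_payout == one_box_payout + 1_000
```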
Newcomb's problem is a thought experiment that pits expected-value calculations against the dominance principle. One-boxers tend to be one-boxers because one-boxers get rich as a result of being one-boxers. Two-boxers tend to be two-boxers because reasoning through the problem shows that two-boxing yields an extra thousand dollars. The debate has not been settled, and the discourse around it rages on.
I was reading this post that sums up the discourse on Newcomb's problem quite nicely. The post argues that one-boxers care about what the optimal agent type is, whereas two-boxers care about what the optimal decision is. So perhaps one-boxers and two-boxers are simply talking past each other.
After all, everyone agrees that, if the prediction is based on your disposition, then it was optimal to have had the disposition of a one-boxer at the time the prediction was made. The difference is that the two-boxer further insists that you had no control over your disposition at that time, and that once the prediction has been made, two-boxing is the optimal decision because it causes you to get an extra thousand dollars relative to one-boxing. It seems, then, that one-boxers are simply failing to see the two-boxer's point because they hyper-fixate on the optimal disposition rather than the optimal decision.
I believe this conclusion is exactly backwards. It is the two-boxer who places undue emphasis on disposition, whereas one-boxers care only about results.
To see this, imagine a world where everyone suffers from occasional, involuntary muscle spasms.
In this world, we present these spasm-prone participants with a version of Newcomb's problem where you make your decision by pressing one of two buttons, corresponding to one-boxing or two-boxing.
We then set up a predictor who can account for these muscle spasms while still producing accurate predictions.
In such a world, some of the people who one-box actually intended to two-box before an involuntary muscle spasm made them accidentally press the one-box button. Call such agents the involuntary one-boxers.
Despite having the disposition of a two-boxer, the involuntary one-boxers are still one-boxers. It doesn't matter that they intended to two-box to scoop up the "extra" thousand dollars - they in fact one-boxed, and so they are one-boxers.
The predictor's goal is to predict which button gets pressed - nothing more, nothing less. It is completely possible that the predictor does not care at all about what your disposition is - there is a world in which muscle spasms are so frequent that the predictor simply needs to be good at predicting the direction of these involuntary muscle spasms without caring one bit about the agents' dispositions.
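The setup above can be sketched in a few lines (the spasm rate, the button names, and a perfectly accurate press-predictor are all illustrative assumptions):

```python
import random

random.seed(0)  # for reproducibility

def play(disposition, spasm_rate=0.3):
    """Simulate one round for an agent whose disposition is 'one' or
    'two'. A spasm flips the intended button press; the predictor
    forecasts the actual press, spasm included."""
    press = disposition
    if random.random() < spasm_rate:
        press = 'one' if disposition == 'two' else 'two'
    # A perfect press-predictor fills the opaque box accordingly.
    opaque = 1_000_000 if press == 'one' else 0
    payout = opaque if press == 'one' else opaque + 1_000
    return press, payout

# Agents with a two-boxer's disposition: whenever a spasm makes them
# press the one-box button, they still collect the million.
results = [play('two') for _ in range(1_000)]
assert all(pay == 1_000_000 for press, pay in results if press == 'one')
assert all(pay == 1_000 for press, pay in results if press == 'two')
```

The payouts track only which button was actually pressed; the agents' dispositions never enter into it.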
We can extend this thought to the actual world, where involuntary muscle spasms are infrequent. Whether you are a one-boxer or a two-boxer depends solely on whether you actually one-box or two-box. The only thing we know in Newcomb's problem is that the predictor accurately predicts decisions.
Given this, if you have the disposition of a two-boxer, and somehow one-box anyway, you will reap the rewards of a one-boxer.
You don't have to understand why being a one-boxer works in order to acknowledge that it works. And being a one-boxer is simply about actually one-boxing. In other words, you are not a one-boxer merely for thinking like a one-boxer. You can talk the talk, but what matters is whether you walk the walk.
If we follow the two-boxers' conventional line of thinking, it is clear that their argument hyper-fixates on the importance of disposition. Their argument goes:
You cannot influence what your disposition in the past was, and the prediction is made based on your disposition, so you cannot influence the contents of the opaque box. Therefore, you might as well focus on what you can control going forward, and the only relevant decision before you now is whether you grab an extra one thousand dollars or not.
It may well be true that you cannot influence what your disposition in the past was. However, the muscle spasm thought experiment clearly shows that the prediction itself can be completely unrelated to your disposition.
You may well have had the disposition of a two-boxer. And yet, if you one-box anyway, you reap the rewards of a one-boxer. Not only that, if you one-box anyway, you actually are a one-boxer. It doesn't matter what factors led you to the point of one-boxing; what matters is that you in fact one-boxed.
From this perspective, the one-boxer can agree with the fundamental tenet of focusing only on what you can control while ignoring everything that you cannot.
You may not be able to control who you were in the past, but you can control which button you press. And that is all that matters.
If you press the one-box button, then you will have revealed that you were a one-boxer all along.