The Way You Go Depends A Good Deal On Where You Want To Get To: FEP minimizes surprise about actions using preferences about the future as *evidence*


Published on April 27, 2025 9:55 PM GMT

The motivation for this post came when I was reading various Scott Alexander posts on the Free Energy Principle, in which he seemed very confused about it (the title *God Help Us, Let's Try To Understand Friston On Free Energy* might have hinted at that). I was intrigued but also very confused, so I fell down a rabbit hole, so to speak, trying to figure it out. I am very grateful for *Active Inference: The Free Energy Principle in Mind, Brain, and Behavior*, which was published after Scott's post. This open access book has this amusing section in the preface:

A Note from Karl Friston
I have a confession to make. I did not write much of this book. Or, more precisely, I was not allowed to. This book’s agenda calls for a crisp and clear writing style that is beyond me. Although I was allowed to slip in a few of my favorite words, what follows is a testament to Thomas and Giovanni, their deep understanding of the issues at hand, and, importantly, their theory of mind—in all senses.

Free Energy and the Dark Room problem

The Free Energy Principle asserts that biological systems (such as trees and the human brain) have beliefs and take actions that minimize a quantity known as Variational Free Energy. Variational Free Energy is a function of beliefs ($Q$) and sense data ($y$), where sense data comes from the external world ($x$) and from actions:

$$F(Q, y) = \mathbb{E}_{x \sim Q}\left[\log Q(x) - \log P(x, y)\right] = D_{\mathrm{KL}}\!\left(Q(x) \,\|\, P(x \mid y)\right) - \log P(y)$$

$Q(x)$ is meant to be the system's attempt at approximating $P(x \mid y)$. For simplicity, for most of this post I'll assume that the system can do perfect Bayesian inference, so that $Q(x) = P(x \mid y)$ and the KL term vanishes, implying that $F = -\log P(y)$. In other words, the system is trying to minimize how surprising the sense data is.
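To make the bookkeeping concrete, here is a minimal Python sketch (all numbers are my own, purely illustrative) of a two-state generative model. It checks that when $Q$ is the exact posterior, the free energy collapses to the surprisal $-\log P(y)$, and that any other $Q$ pays an extra KL penalty:

```python
import numpy as np

# A two-state toy generative model (all numbers are made up for illustration).
p_x = np.array([0.7, 0.3])           # prior P(x) over hidden states
p_y_given_x = np.array([[0.9, 0.1],  # likelihood P(y | x = 0)
                        [0.2, 0.8]]) # likelihood P(y | x = 1)

def free_energy(q, y):
    """Variational free energy F(Q, y) = E_Q[log Q(x) - log P(x, y)]."""
    joint = p_x * p_y_given_x[:, y]  # P(x, y) for the observed y
    return np.sum(q * (np.log(q) - np.log(joint)))

y = 0                                          # the observed sense datum
p_y = np.sum(p_x * p_y_given_x[:, y])          # evidence P(y)
posterior = p_x * p_y_given_x[:, y] / p_y      # exact posterior P(x | y)

print(free_energy(posterior, y))               # ≈ 0.371
print(-np.log(p_y))                            # ≈ 0.371, i.e. -log P(y)
print(free_energy(np.array([0.5, 0.5]), y))    # ≈ 0.94: imperfect Q pays a KL penalty
```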

An apparent paradox is the Dark Room problem: a dark room offers very little surprise by virtue of containing very little information. So why don't all organisms hang out in dark rooms? According to *Translating Predictive Coding Into Perceptual Control*:

The main proposed solution is to claim you have some built-in predictions (of eg light, social interaction, activity levels), and the dark room will violate those.

Friston's explanation is similar, if a bit wordy.

The free-energy principle says that we harvest sensory signals that we can predict (cf., emulation theory; Grush, 2004); ensuring we keep to well-trodden paths in the space of all the physical and physiological variables that underwrite our existence.

I am pretty sure these resolutions don't actually resolve the problem. While technically true, they miss the core issue.

The actual resolution is this: deciding to be in a dark room minimizes *future* free energy, but the FEP actually says that actions minimize the *present* free energy.

In other words, there has been substantial confusion about what exactly is being optimized with respect to what variables.

You can sense your actions now, and that is the only sense you can change

Usually when we think of senses, we mean sensing the external world, such as by using sight and sound and what not. However, in the Free Energy literature, this definition is typically somewhat expanded: actions also count as sense data. In this way, we can factor $P(y)$ as $P(y_{\mathrm{ext}} \mid a)\,P(a)$, where $y_{\mathrm{ext}}$ is the external sensory input and $a$ is the action.

This is why minimizing the present free energy is even possible. No action you can take (such as entering a dark room) will immediately register in your vision or other external senses. Only your actions can immediately enter the sense data while solving this optimization problem.
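As a tiny sketch (made-up numbers), the present surprisal factors accordingly, and at the moment of choice the external percept has not changed yet under any available action, so only the action term varies:

```python
import numpy as np

# A sketch with made-up numbers. The present surprisal factors as
#   -log P(y) = -log P(y_ext | a) - log P(a).
# Before anything is done, the external percept y_ext is identical
# under every available action, so that term is constant.
p_ext_given_a = 1.0  # same percept whatever you are about to do

for action, p_a in [("action_a", 0.2), ("action_b", 0.8)]:
    surprise = -np.log2(p_ext_given_a) - np.log2(p_a)
    print(f"{action}: {surprise:.2f} bits")  # only the -log P(a) term varies
```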

You choose the least surprising action

‘The rule is, jam to-morrow and jam yesterday—but never jam to-day.’

— Lewis Carroll, "Through the Looking-Glass"

Consider a choice between the dark room on the left and a room with delicious food on the right. The problem is, you don't know what food it is! The door is currently closed. But you do know the food is good. In particular, you know it is more likely that you will go into the food room than the dark room (because obviously you like food better than darkness).

Consider these priors: say $P(\text{Go right}) = 0.9$ and $P(\text{Go left}) = 0.1$, while the food room could contain any of sixteen equally likely dishes and the dark room contains nothing unpredictable at all.

So, sadly, if you go into the food room, you will have more free energy than if you went into the dark room. Do we need a fudge factor? (Would adding fudge into the mix make things worse?)

Nope. You choose the action that is currently least surprising according to your own model and preferences. Minimizing surprise now means choosing the action $a$ that maximizes $P(a)$, or equivalently, minimizes $-\log P(a)$. Since $P(\text{Go right}) > P(\text{Go left})$, "Go right" is the less surprising action choice right now. The potential future sensory surprise inside the food room is not the deciding factor for the current action.

We can even add random sensory information (like random numbers) to the right room. This will increase its future free energy, but not the present free energy of choosing right.
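A small numerical sketch (my own illustrative numbers, matching the priors above) makes the separation explicit: the action's present surprise $-\log P(a)$ and the expected future surprise once inside the room simply come apart:

```python
import numpy as np

# My own illustrative numbers: you expect yourself to choose the food room.
p_action = {"go_left_dark": 0.1, "go_right_food": 0.9}

# Once inside, the dark room offers exactly one possible percept, while the
# food room could hold any of 16 equally likely dishes.
n_outcomes = {"go_left_dark": 1, "go_right_food": 16}

for action, p_a in p_action.items():
    present = -np.log2(p_a)               # -log P(a): what the FEP minimizes now
    future = np.log2(n_outcomes[action])  # expected surprise once inside
    print(f"{action}: present {present:.2f} bits, future {future:.2f} bits")

# go_left_dark:  present 3.32 bits, future 0.00 bits
# go_right_food: present 0.15 bits, future 4.00 bits
# "Go right" is the less surprising action even though the room behind it is
# far less predictable; padding the food room with random numbers raises the
# future column without touching the present one.
```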

Likewise, if you end up in the dark room, the least surprising action is to leave immediately. The fact that you will be surprised once you leave is irrelevant.

This is also why the "humans don't like the dark" explanation is technically correct. If humans really did like darkness, $P(\text{Go left})$ might be higher than $P(\text{Go right})$. The reason it is a bad explanation is that it misses the point: the predictability of the dark room is irrelevant to the current action choice. Random sensory info might contribute billions of bits to the future free energy! I am sure that Friston understands this, but I think he didn't realize that others missed that point!

P is well-calibrated

Here is another apparent problem:

So for example, suppose it’s freezing cold out, and this makes you unhappy, and so you try to go inside to get warm. FE/PC would describe this as “You naturally predict that you will be a comfortable temperature, so the cold registers as strong prediction error, so in order to minimize prediction error you go inside and get warm.” PCT would say “Your temperature set point is fixed at ‘comfortable’, the cold marks a wide deviation from your temperature set point, so in order to get closer to your set point, you go inside”.

The PCT version makes more sense to me here because the phrase “you naturally predict that you will be a comfortable temperature” doesn’t match any reasonable meaning of “predict”.

This again mixes up the present and the future, but in the other direction. When you predict you will be a comfortable temperature, that is a belief about the future. All predictions are beliefs about the future.

But didn't I just say that the Free Energy Principle is about the present? The trick is that learning about the future requires you to do a Bayesian update on your predicted actions in the present. This is known as planning-as-inference. In fact, by dropping the first term in free energy (the KL divergence, which vanishes under perfect inference), this entire post has secretly been about planning-as-inference.

So why do people say that minimizing free energy means minimizing prediction errors? Because your actions now affect your predictions about the future.

Thus, our prior preference/prediction to be comfortable soon makes the action "start moving inside now" the most probable (least surprising) action to infer in the present. Starting to move inside now reduces our prediction error about the future, and Bayes favors hypotheses that reduce prediction errors.
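Here is a minimal planning-as-inference sketch (the action names and probabilities are mine, purely illustrative): the preference about the future enters as *evidence* in a Bayesian update over present actions.

```python
import numpy as np

# Illustrative action names and probabilities (mine, not from the post).
actions = ["stay_outside", "go_inside"]
prior = np.array([0.5, 0.5])        # P(a): prior over present actions

# P(comfortable soon | a): how strongly each action predicts the
# preferred future of being warm.
likelihood = np.array([0.05, 0.95])

# Bayesian update: condition present actions on the predicted future.
posterior = prior * likelihood
posterior /= posterior.sum()        # P(a | comfortable soon)

for a, p in zip(actions, posterior):
    print(f"P({a} | comfortable soon) = {p:.2f}")
# stay_outside: 0.05, go_inside: 0.95 -- the preference about the future acts
# as evidence, making "go inside" the least surprising action right now.
```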

And in fact, if we decide to go inside, all our beliefs turn out correct! We are cold now, we decided to go inside, and we will be comfy later. So $P$ is a well-calibrated self-fulfilling prophecy.

Same thing for the motor cortex. How ought your muscles to move? To quote the Cheshire Cat, "that depends a good deal on where you want to get to." It is not telling you where your muscles are now; it is fulfilling predictions about where your muscles will be a fraction of a second from now (which usually happens to be close to where they are now).

Since $P$ is making sensible predictions, I think "you naturally predict that you will be a comfortable temperature" is completely reasonable!

Note that $P$ is occasionally wrong, though. For example, yesterday (true story) when I was chewing food, I thought I had only partially swallowed it, when in fact I had swallowed all of it. In other words, I had a false belief that I had food in my mouth! Did that cause me to resolve the error by putting food in my mouth? No, the most likely action was to try chewing the food, which I did, causing my teeth to clack on nothing (and presumably causing many prediction errors in my motor cortex). When I discovered the error, was my response to put food in my mouth? No, because that prediction error was now a few moments in the past. My beliefs about the past updated, but my decisions in the present could only influence the future probability, not the past probability, of having food in my mouth.


